Mercurial > ift6266
diff writeup/techreport.tex @ 441:1272dc84a30c
merge
author | Arnaud Bergeron <abergeron@gmail.com> |
---|---|
date | Mon, 03 May 2010 13:55:03 -0400 |
parents | 89258bb41e4c bfa349f567e8 |
children | 89a49dae6cf3 |
--- a/writeup/techreport.tex Mon May 03 12:18:03 2010 -0400
+++ b/writeup/techreport.tex Mon May 03 13:55:03 2010 -0400
@@ -84,16 +84,16 @@
 and also because latter transformations have similar effects.
 The $slant$ coefficient can be negative or positive with equal probability and its value is randomly
 sampled according to the complexity level.
-In our case we take uniformly a number in the range $[0,complexity]$, that means, in our case, that the maximum displacement for the lowest
+In our case we take uniformly a number in the range $[0,complexity]$, so the maximum displacement for the lowest
 or highest pixel line is of $round(complexity \times 32)$.
 
 \subsection{Changing Thickness}
 
-To change the thickness of the characters we used morpholigical operators: dilation and erosion~\cite{Haralick87,Serra82}.i
+To change the thickness of the characters we used morpholigical operators: dilation and erosion~\cite{Haralick87,Serra82}.
 The basic idea of such transform is, for each pixel, to multiply in the element-wise manner
 its neighbourhood with a matrix called the structuring element. Then for dilation we remplace the pixel value
 by the maximum of the result, or the minimum for erosion.
-This will dilate or erode objects in the image and strength of the transform only depends on the structuring element.
+This will dilate or erode objects in the image and the strength of the transform only depends on the structuring element.
 We used ten different structural elements with increasing dimensions (the biggest is $5\times5$).
 for each image, we radomly sample the operator type (dilation or erosion) with equal
 probability and one structural element
@@ -103,8 +103,14 @@
 \subsection{Affine Transformations}
 
 We generate an affine transform matrix according to the complexity level, then we apply it directly to the image.
-This allows to produce scaling, translation, rotation and shearing variances. We took care that the maximum rotation applied
-to the image is low enough not to confuse classes.
+The matrix is of size $2 \times 3$, so we can represent it by six parameters $(a,b,c,d,e,f)$.
+Formally, for each pixel $(x,y)$ of the output image,
+we give the value of the pixel nearest to : $(ax+by+c,dx+ey+f)$, in the input image.
+This allows to produce scaling, translation, rotation and shearing variances.
+
+The sampling of the parameters $(a,b,c,d,e,f)$ have been tuned by hand to forbid important rotations (not to confuse classes) but to give good variability of the transformation. For each image we sample uniformly the parameters in the following ranges:
+$a$ and $d$ in $[1-3 \times complexity,1+3 \times complexity]$, $b$ and $e$ in $[-3 \times complexity,3 \times complexity]$ and $c$ and $f$ in $[-4 \times complexity, 4 \times complexity]$.
+
 \subsection{Local Elastic Deformations}
 This filter induces a "wiggly" effect in the image. The description here will be brief, as the algorithm follows precisely
 what is described in \cite{SimardSP03}.
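
The affine mapping added in this changeset can be illustrated with a small sketch. It is not the project's code: the function names `sample_affine_params` and `apply_affine`, the list-of-rows image layout, and the `fill` value for out-of-range source pixels are all assumptions made for illustration. The sampling ranges are copied from the text as written. Note that under the stated mapping the identity transform is $(a,b,c,d,e,f) = (1,0,0,0,1,0)$, whereas the quoted ranges centre $a$ and $d$ on 1; the sketch reproduces the ranges literally and does not try to resolve that.

```python
import random


def sample_affine_params(complexity, rng=random):
    """Draw (a, b, c, d, e, f) uniformly from the ranges quoted in the text.

    `rng` is any object with a `uniform(lo, hi)` method (hypothetical
    parameter, added so the sampling can be seeded).
    """
    a = rng.uniform(1 - 3 * complexity, 1 + 3 * complexity)
    d = rng.uniform(1 - 3 * complexity, 1 + 3 * complexity)
    b = rng.uniform(-3 * complexity, 3 * complexity)
    e = rng.uniform(-3 * complexity, 3 * complexity)
    c = rng.uniform(-4 * complexity, 4 * complexity)
    f = rng.uniform(-4 * complexity, 4 * complexity)
    return a, b, c, d, e, f


def apply_affine(image, params, fill=0.0):
    """Nearest-pixel resampling: output (x, y) reads input (ax+by+c, dx+ey+f).

    `image` is a list of rows indexed as image[y][x]. Output pixels whose
    source falls outside the input receive `fill` (an assumption; the text
    does not say how borders are handled).
    """
    a, b, c, d, e, f = params
    height, width = len(image), len(image[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Round to the nearest input pixel, as the text describes.
            src_x = int(round(a * x + b * y + c))
            src_y = int(round(d * x + e * y + f))
            if 0 <= src_x < width and 0 <= src_y < height:
                out[y][x] = image[src_y][src_x]
    return out
```

For example, `apply_affine(img, (1, 0, 1, 0, 1, 0))` shifts the content one pixel to the left, because each output pixel reads the input pixel one column to its right.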