ift6266: writeup/techreport.tex comparison

comparison writeup/techreport.tex @ 441:1272dc84a30c

merge

author	Arnaud Bergeron <abergeron@gmail.com>
date	Mon, 03 May 2010 13:55:03 -0400
parents	89258bb41e4c bfa349f567e8
children	89a49dae6cf3

-:89258bb41e4c
+:1272dc84a30c
 In order to mimic a slant effect, we simply shift each row of the image proportionally to its height: $shift = round(slant \times height)$.
 We round the shift in order to have a discrete displacement. We do not use a filter to smooth the result, in order to save computing time
 and also because later transformations have similar effects.
 The $slant$ coefficient can be negative or positive with equal probability and its value is randomly sampled according to the complexity level.
-In our case we take uniformly a number in the range $[0,complexity]$, that means, in our case, that the maximum displacement for the lowest
+In our case we take uniformly a number in the range $[0,complexity]$, so the maximum displacement for the lowest
 or highest pixel line is $round(complexity \times 32)$.
 \subsection{Changing Thickness}
-To change the thickness of the characters we used morphological operators: dilation and erosion~\cite{Haralick87,Serra82}.i
+To change the thickness of the characters we used morphological operators: dilation and erosion~\cite{Haralick87,Serra82}.
 The basic idea of such a transform is, for each pixel, to multiply its neighbourhood element-wise with a matrix called the structuring element.
 Then for dilation we replace the pixel value by the maximum of the result, or the minimum for erosion.
-This will dilate or erode objects in the image and strength of the transform only depends on the structuring element.
+This will dilate or erode objects in the image and the strength of the transform only depends on the structuring element.
 We used ten different structuring elements with increasing dimensions (the biggest is $5\times5$).
 For each image, we randomly sample the operator type (dilation or erosion) with equal probability and one structuring element
 from a subset of the $n$ smallest structuring elements where $n$ is $round(10 \times complexity)$ for dilation and $round(6 \times complexity)$ for erosion.
 A neutral element is always present in the set; if it is chosen, the transformation is not applied.
 Erosion is restricted to the six smallest structuring elements because it may completely erase a character that is too thin.
 \subsection{Affine Transformations}
 We generate an affine transform matrix according to the complexity level, then we apply it directly to the image.
-This allows to produce scaling, translation, rotation and shearing variances. We took care that the maximum rotation applied
+The matrix is of size $2 \times 3$, so we can represent it by six parameters $(a,b,c,d,e,f)$.
-to the image is low enough not to confuse classes.
+Formally, for each pixel $(x,y)$ of the output image,
+we assign the value of the input-image pixel nearest to $(ax+by+c,dx+ey+f)$.
+This produces scaling, translation, rotation and shearing variations.
+The sampling of the parameters $(a,b,c,d,e,f)$ has been tuned by hand to forbid large rotations (so as not to confuse classes) while still giving good variability of the transformation. For each image we sample the parameters uniformly in the following ranges:
+$a$ and $d$ in $[1-3 \times complexity,1+3 \times complexity]$, $b$ and $e$ in $[-3 \times complexity,3 \times complexity]$ and $c$ and $f$ in $[-4 \times complexity, 4 \times complexity]$.
 \subsection{Local Elastic Deformations}
 This filter induces a ``wiggly'' effect in the image. The description here will be brief, as the algorithm follows precisely what is described in \cite{SimardSP03}.
 The general idea is to generate two ``displacement'' fields, for horizontal and vertical displacements of pixels. Each of these fields has the same size as the original image.
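
The slant paragraph in the hunk above can be illustrated with a minimal Python sketch. It is not the project's code: the function names `sample_slant` and `slant_image` are hypothetical, the image is assumed to be a list of rows, `height` is assumed to be the row index counted from the top, and pixels shifted outside the image are assumed to be dropped and replaced by a fill value of 0.

```python
import random


def sample_slant(complexity, rng=random):
    """Draw a magnitude uniformly in [0, complexity], then pick a sign with equal probability."""
    magnitude = rng.uniform(0.0, complexity)
    return magnitude if rng.random() < 0.5 else -magnitude


def slant_image(image, slant, fill=0):
    """Shift row h horizontally by round(slant * h) pixels, with no smoothing filter.

    Assumption: h is the row index from the top; vacated pixels take `fill`.
    Note that Python's round() sends exact halves to the nearest even integer.
    """
    width = len(image[0])
    out = []
    for h, row in enumerate(image):
        shift = round(slant * h)
        new_row = [fill] * width
        for x, value in enumerate(row):
            target = x + shift
            if 0 <= target < width:
                new_row[target] = value
        out.append(new_row)
    return out
```

Under these assumptions, a vertical stroke in the first column of a small image becomes a diagonal when `slant` is 1.0, and is left unchanged when `slant` is 0.0.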

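The pixel mapping added to the Affine Transformations subsection in this changeset can be sketched as follows. This is only an illustration under stated assumptions, not the report's implementation: `apply_affine` is a hypothetical name, `x` is taken to be the column index and `y` the row index with the origin at the top-left, and output pixels whose source falls outside the input image are assumed to receive a fill value of 0. Parameter sampling is deliberately left out; the sketch covers only the nearest-pixel lookup.

```python
import math


def apply_affine(image, params, fill=0):
    """For each output pixel (x, y), copy the input pixel nearest to (a*x + b*y + c, d*x + e*y + f).

    Assumptions: image is a list of rows, x indexes columns, y indexes rows,
    and sources outside the image yield `fill`.
    """
    a, b, c, d, e, f = params
    height, width = len(image), len(image[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Nearest input pixel; exact halves are rounded up.
            src_x = math.floor(a * x + b * y + c + 0.5)
            src_y = math.floor(d * x + e * y + f + 0.5)
            if 0 <= src_x < width and 0 <= src_y < height:
                out[y][x] = image[src_y][src_x]
    return out
```

Because the lookup goes from output coordinates to input coordinates, every output pixel receives exactly one value and no holes appear. With this convention, a positive `c` moves the image content to the left, since output pixel `x` reads from input column `x + c`.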