comparison writeup/techreport.tex @ 416:5f9d04dda707

Correction d'une erreur pour pinch et ajout d'une ref bibliographique
author fsavard
date Thu, 29 Apr 2010 18:26:30 -0400
parents 1e9788ce1680
children 0282882aa91f
comparison
equal deleted inserted replaced
415:1e9788ce1680 416:5f9d04dda707
91 This allows to produce scaling, translation, rotation and shearing variances. We took care that the maximum rotation applied 91 This allows to produce scaling, translation, rotation and shearing variances. We took care that the maximum rotation applied
92 to the image is low enough not to confuse classes. 92 to the image is low enough not to confuse classes.
93 93
94 \subsection{Local Elastic Deformations} 94 \subsection{Local Elastic Deformations}
95 95
96 This filter induces a "wiggly" effect in the image. The description here will be brief, as the algorithm follows precisely what is described in . 96 This filter induces a "wiggly" effect in the image. The description here will be brief, as the algorithm follows precisely what is described in \cite{SimardSP03}.
97 97
98 The general idea is to generate two "displacements" fields, for horizontal and vertical displacements of pixels. Each of these fields has the same size as the original image. 98 The general idea is to generate two "displacements" fields, for horizontal and vertical displacements of pixels. Each of these fields has the same size as the original image.
99 99
100 When generating the transformed image, we'll loop over the x and y positions in the fields and select, as a value, the value of the pixel in the original image at the (relative) position given by the displacement fields for this x and y. If the position we'd retrieve is outside the borders of the image, we use a 0 value instead. 100 When generating the transformed image, we'll loop over the x and y positions in the fields and select, as a value, the value of the pixel in the original image at the (relative) position given by the displacement fields for this x and y. If the position we'd retrieve is outside the borders of the image, we use a 0 value instead.
101 101
117 117
118 \subsection{Pinch} 118 \subsection{Pinch}
119 119
120 This is another GIMP filter we used. The filter is in fact named "Whirl and pinch", but we don't use the "whirl" part (whirl is set to 0). As described in GIMP, a pinch is "similar to projecting the image onto an elastic surface and pressing or pulling on the center of the surface". 120 This is another GIMP filter we used. The filter is in fact named "Whirl and pinch", but we don't use the "whirl" part (whirl is set to 0). As described in GIMP, a pinch is "similar to projecting the image onto an elastic surface and pressing or pulling on the center of the surface".
121 121
122 Mathematically, think of drawing a circle of radius $r$ around a center point $C$. Any point (pixel) $P$ belonging to that disk (region inside circle) will have its value recalculated by taking the value of another "source" pixel in the original image. The position of that source pixel is found on the line thats goes through $C$ and $P$, but at some other distance $d_2$. Define $d_1$ to be the distance between $P$ and $C$. $d_2$ is given by $d_2 = sin(\frac{\pi{}d_1}{2r})^{-pinch}$, where $pinch$ is a parameter to the filter. 122 Mathematically, for a square input image, think of drawing a circle of radius $r$ around a center point $C$. Any point (pixel) $P$ belonging to that disk (region inside circle) will have its value recalculated by taking the value of another "source" pixel in the original image. The position of that source pixel is found on the line thats goes through $C$ and $P$, but at some other distance $d_2$. Define $d_1$ to be the distance between $P$ and $C$. $d_2$ is given by $d_2 = sin(\frac{\pi{}d_1}{2r})^{-pinch} \times d_1$, where $pinch$ is a parameter to the filter.
123
124 If the image is not square
123 125
124 The actual value is given by bilinear interpolation considering the pixels around the (non-integer) source position. 126 The actual value is given by bilinear interpolation considering the pixels around the (non-integer) source position.
125 127
126 The value for $pinch$ in our case was given by sampling from an uniform distribution over the range $[-complexity, 0.7 \times complexity]$. 128 The value for $pinch$ in our case was given by sampling from an uniform distribution over the range $[-complexity, 0.7 \times complexity]$.
127 129
134 136
135 These four sizes collectively define a window centered on the middle pixel of the occlusive image. This is the part that will be extracted as the occlusion. 137 These four sizes collectively define a window centered on the middle pixel of the occlusive image. This is the part that will be extracted as the occlusion.
136 138
137 The next step is to select a destination position in the occluded image. Vertical and horizontal displacements $y\_arrivee$ and $x\_arrivee$ are selected according to Gaussian distributions of mean 0 and of standard deviations of, respectively, 3 and 2. Then an horizontal placement mode, $endroit$ (meaning location), is selected to be of three values meaning left, middle or right. 139 The next step is to select a destination position in the occluded image. Vertical and horizontal displacements $y\_arrivee$ and $x\_arrivee$ are selected according to Gaussian distributions of mean 0 and of standard deviations of, respectively, 3 and 2. Then an horizontal placement mode, $endroit$ (meaning location), is selected to be of three values meaning left, middle or right.
138 140
139 If $endroit$ is "middle", the occlusion will be horizontally centered around the horizontal middle of the occluded image, then shifted according to $x_\arrivee$. If $endroit$ is "left", it will be placed on the left of the occluded image, then displaced right according to $x_\arrivee$. The contrary happens if $endroit$ is $right$. 141 If $endroit$ is "middle", the occlusion will be horizontally centered around the horizontal middle of the occluded image, then shifted according to $x\_arrivee$. If $endroit$ is "left", it will be placed on the left of the occluded image, then displaced right according to $x\_arrivee$. The contrary happens if $endroit$ is $right$.
140 142
141 In both the horizontal and vertical positionning, the maximum position in either direction is such that the selected occlusion won't go beyond the borders of the occluded image. 143 In both the horizontal and vertical positionning, the maximum position in either direction is such that the selected occlusion won't go beyond the borders of the occluded image.
142 144
143 This filter has a probability of not being applied, at all, of 60%. 145 This filter has a probability of not being applied, at all, of 60\%.
144 146
145 \subsection{Background Images} 147 \subsection{Background Images}
146 148
147 This transformation adds a random background behind the letter. The background is chosen by first selecting, at random, an image from a set of images. Then we choose a 32x32 subregion of that image as the background image (by sampling x and y positions uniformly while making sure not to cross image borders). 149 This transformation adds a random background behind the letter. The background is chosen by first selecting, at random, an image from a set of images. Then we choose a 32x32 subregion of that image as the background image (by sampling x and y positions uniformly while making sure not to cross image borders).
148 150