comparison writeup/nips2010_submission.tex @ 517:0a5945249f2b

section 2, quick first pass
author Dumitru Erhan <dumitru.erhan@gmail.com>
date Tue, 01 Jun 2010 11:14:48 -0700
parents d057941417ed
children 460a4e78c9a4
511:d057941417ed 517:0a5945249f2b
155 {\bf Slant.} 155 {\bf Slant.}
156 We mimic slant by shifting each row of the image 156 We mimic slant by shifting each row of the image
157 proportionally to its height: $shift = round(slant \times height)$. 157 proportionally to its height: $shift = round(slant \times height)$.
158 The $slant$ coefficient can be negative or positive with equal probability 158 The $slant$ coefficient can be negative or positive with equal probability
159 and its value is randomly sampled according to the complexity level: 159 and its value is randomly sampled according to the complexity level:
160 e $slant \sim U[0,complexity]$, so the 160 $slant \sim U[0,complexity]$, so the
161 maximum displacement for the lowest or highest pixel line is 161 maximum displacement for the lowest or highest pixel line is
162 $round(complexity \times 32)$.\\ 162 $round(complexity \times 32)$.\\
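The slant rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the use of the row index as the "height" of a row, and the zero fill for vacated pixels are all assumptions.

```python
import random

def sample_slant(complexity, rng=random):
    """Draw slant ~ U[0, complexity], then flip its sign with probability 1/2."""
    magnitude = rng.uniform(0.0, complexity)
    sign = rng.choice((-1.0, 1.0))
    return sign * magnitude

def apply_slant(image, slant, fill=0.0):
    """Shift each row horizontally by round(slant * row_index).

    `image` is a list of equal-length rows. Pixels shifted outside the
    image are dropped and vacated pixels are set to `fill`; both choices
    are assumptions, the text does not specify the border handling.
    """
    width = len(image[0])
    out = []
    for y, row in enumerate(image):
        shift = int(round(slant * y))  # assumed: "height" is the row index
        new_row = [fill] * width
        for x, value in enumerate(row):
            nx = x + shift
            if 0 <= nx < width:
                new_row[nx] = value
        out.append(new_row)
    return out
```

For a 32-row image the largest shift produced by this sketch is about `round(complexity * 31)`, close to the bound quoted in the text.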
163 {\bf Thickness.} 163 {\bf Thickness.}
164 Morphological operators of dilation and erosion~\citep{Haralick87,Serra82} 164 Morphological operators of dilation and erosion~\citep{Haralick87,Serra82}
165 are applied. The neighborhood of each pixel is multiplied 165 are applied. The neighborhood of each pixel is multiplied
185 variability of the transformation: $a$ and $d$ $\sim U[1-3 \times 185 variability of the transformation: $a$ and $d$ $\sim U[1-3 \times
186 complexity,1+3 \times complexity]$, $b$ and $e$ $\sim U[-3 \times complexity,3 186 complexity,1+3 \times complexity]$, $b$ and $e$ $\sim U[-3 \times complexity,3
187 \times complexity]$ and $c$ and $f$ $\sim U[-4 \times complexity, 4 \times 187 \times complexity]$ and $c$ and $f$ $\sim U[-4 \times complexity, 4 \times
188 complexity]$.\\ 188 complexity]$.\\
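As a hedged illustration of the sampling ranges just listed, the six affine parameters could be drawn as follows. The function name and the dictionary return type are hypothetical, and the sketch assumes that $b$ and $e$ are uniform like the other four parameters.

```python
import random

def sample_affine_params(complexity, rng=random):
    """Sample the six affine parameters from the ranges given in the text.

    a, d ~ U[1 - 3c, 1 + 3c]; b, e ~ U[-3c, 3c] (assumed uniform);
    c, f ~ U[-4c, 4c], where c is the complexity level.
    """
    k = complexity
    return {
        "a": rng.uniform(1 - 3 * k, 1 + 3 * k),
        "b": rng.uniform(-3 * k, 3 * k),
        "c": rng.uniform(-4 * k, 4 * k),
        "d": rng.uniform(1 - 3 * k, 1 + 3 * k),
        "e": rng.uniform(-3 * k, 3 * k),
        "f": rng.uniform(-4 * k, 4 * k),
    }
```

At complexity 0 this returns $a = d = 1$ and zero for the other four parameters, so the variability vanishes as expected.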
189 {\bf Local Elastic Deformations.} 189 {\bf Local Elastic Deformations.}
190 This filter induces a "wiggly" effect in the image, following~\citet{SimardSP03-short}, 190 This filter induces a ``wiggly'' effect in the image, following~\citet{SimardSP03-short},
191 which provides more details. 191 which provides more details.
192 Two "displacements" fields are generated and applied, for horizontal 192 Two ``displacements'' fields are generated and applied, for horizontal
193 and vertical displacements of pixels. 193 and vertical displacements of pixels.
194 To generate a pixel in either field, first a value between -1 and 1 is 194 To generate a pixel in either field, first a value between -1 and 1 is
195 chosen from a uniform distribution. Then all the pixels, in both fields, are 195 chosen from a uniform distribution. Then all the pixels, in both fields, are
196 multiplied by a constant $\alpha$ which controls the intensity of the 196 multiplied by a constant $\alpha$ which controls the intensity of the
197 displacements (larger $\alpha$ translates into larger wiggles). 197 displacements (larger $\alpha$ translates into larger wiggles).
198 Each field is convolved with a Gaussian 2D kernel of 198 Each field is convolved with a Gaussian 2D kernel of
199 standard deviation $\sigma$. Visually, this results in a blur. 199 standard deviation $\sigma$. Visually, this results in a blur.
200 $\alpha = \sqrt[3]{complexity} \times 10.0$ and $\sigma = 10 - 7 \times 200 $\alpha = \sqrt[3]{complexity} \times 10.0$ and $\sigma = 10 - 7 \times
201 \sqrt[3]{complexity}$.\\ 201 \sqrt[3]{complexity}$.\\
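The generation of one displacement field, as described above, might look like the following pure-Python sketch. It is not the authors' code: the function names, the truncation of the kernel at three standard deviations, and the zero padding at the image border are assumptions. The 2D Gaussian convolution is carried out as two 1D passes, which is equivalent because the Gaussian kernel is separable.

```python
import math
import random

def elastic_params(complexity):
    """Return (alpha, sigma) from the formulas given in the text."""
    root = complexity ** (1.0 / 3.0)
    return 10.0 * root, 10.0 - 7.0 * root

def gaussian_kernel_1d(sigma):
    """Normalized 1D Gaussian, truncated at 3 sigma (an assumption)."""
    radius = max(1, int(math.ceil(3.0 * sigma)))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(field, kernel):
    """Convolve every row with the kernel, zero padding at the borders."""
    radius = len(kernel) // 2
    width = len(field[0])
    out = []
    for row in field:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = x + k - radius
                if 0 <= xx < width:
                    acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out

def transpose(field):
    return [list(col) for col in zip(*field)]

def displacement_field(size, complexity, rng=random):
    """One size x size field: uniform noise in [-1, 1], scaled by alpha,
    then smoothed with a Gaussian of standard deviation sigma."""
    alpha, sigma = elastic_params(complexity)
    field = [[alpha * rng.uniform(-1.0, 1.0) for _ in range(size)]
             for _ in range(size)]
    kernel = gaussian_kernel_1d(sigma)
    field = blur_rows(field, kernel)                        # horizontal pass
    field = transpose(blur_rows(transpose(field), kernel))  # vertical pass
    return field
```

Calling `displacement_field(32, complexity)` twice would give the horizontal and vertical fields. Applying them to the image (for example with bilinear interpolation) is left out of this sketch.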
202 {\bf Pinch.} 202 {\bf Pinch.}
203 This GIMP filter is named "Whirl and 203 This is a GIMP filter called ``Whirl and
204 pinch", but whirl was set to 0. A pinch is ``similar to projecting the image onto an elastic 204 pinch'', but whirl was set to 0. A pinch is ``similar to projecting the image onto an elastic
205 surface and pressing or pulling on the center of the surface''~\citep{GIMP-manual}. 205 surface and pressing or pulling on the center of the surface''~\citep{GIMP-manual}.
206 For a square input image, think of drawing a circle of 206 For a square input image, this is akin to drawing a circle of
207 radius $r$ around a center point $C$. Any point (pixel) $P$ belonging to 207 radius $r$ around a center point $C$. Any point (pixel) $P$ belonging to
208 that disk (region inside circle) will have its value recalculated by taking 208 that disk (region inside circle) will have its value recalculated by taking
209 the value of another "source" pixel in the original image. The position of 209 the value of another ``source'' pixel in the original image. The position of
210 that source pixel is found on the line that goes through $C$ and $P$, but 210 that source pixel is found on the line that goes through $C$ and $P$, but
211 at some other distance $d_2$. Define $d_1$ to be the distance between $P$ 211 at some other distance $d_2$. Define $d_1$ to be the distance between $P$
212 and $C$. $d_2$ is given by $d_2 = sin(\frac{\pi{}d_1}{2r})^{-pinch} \times 212 and $C$. $d_2$ is given by $d_2 = sin(\frac{\pi{}d_1}{2r})^{-pinch} \times
213 d_1$, where $pinch$ is a parameter to the filter. 213 d_1$, where $pinch$ is a parameter to the filter.
214 The actual value is given by bilinear interpolation considering the pixels 214 The actual value is given by bilinear interpolation considering the pixels
220 {\large\bf Injecting Noise} 220 {\large\bf Injecting Noise}
221 221
222 \vspace*{1mm} 222 \vspace*{1mm}
223 223
224 {\bf Motion Blur.} 224 {\bf Motion Blur.}
225 This GIMP filter is a ``linear motion blur'' in GIMP 225 This is a ``linear motion blur'' in GIMP
226 terminology, with two parameters, $length$ and $angle$. The value of 226 terminology, with two parameters, $length$ and $angle$. The value of
227 a pixel in the final image is the approximately mean value of the $length$ first pixels 227 a pixel in the final image is approximately the mean value of the $length$ first pixels
228 found by moving in the $angle$ direction. 228 found by moving in the $angle$ direction.
229 Here $angle \sim U[0,360]$ degrees, and $length \sim {\rm Normal}(0,(3 \times complexity)^2)$.\\ 229 Here $angle \sim U[0,360]$ degrees, and $length \sim {\rm Normal}(0,(3 \times complexity)^2)$.\\
230 {\bf Occlusion.} 230 {\bf Occlusion.}
231 This filter selects a random rectangle from an {\em occluder} character 231 Selects a random rectangle from an {\em occluder} character
232 image and places it over the original {\em occluded} character 232 image and places it over the original {\em occluded} character
233 image. Pixels are combined by taking the max(occluder,occluded), 233 image. Pixels are combined by taking the max(occluder,occluded),
234 closer to black. The corners of the occluder The rectangle corners 234 closer to black. The rectangle corners
235 are sampled so that larger complexity gives larger rectangles. 235 are sampled so that larger complexity gives larger rectangles.
236 The destination position in the occluded image is also sampled 236 The destination position in the occluded image is also sampled
237 according to a normal distribution (see more details in~\citet{ift6266-tr-anonymous}). 237 according to a normal distribution (see more details in~\citet{ift6266-tr-anonymous}).
238 It has has a probability of not being applied at all of 60\%.\\ 238 This filter has a probability of 60\% of not being applied.\\
239 {\bf Pixel Permutation.} 239 {\bf Pixel Permutation.}
240 This filter permutes neighbouring pixels. It selects first 240 This filter permutes neighbouring pixels. It selects first
241 $\frac{complexity}{3}$ pixels randomly in the image. Each of them is then 241 $\frac{complexity}{3}$ pixels randomly in the image. Each of them is then
242 sequentially exchanged to one other pixel in its $V4$ neighbourhood. Number 242 sequentially exchanged with one other pixel in its $V4$ neighbourhood. The number
243 of exchanges to the left, right, top, bottom are equal or does not differ 243 of exchanges to the left, right, top, bottom is equal or does not differ
244 by more than 1 if the number of selected pixels is not a multiple of 4. 244 by more than 1 if the number of selected pixels is not a multiple of 4.
245 It has has a probability of not being applied at all of 80\%.\\ 245 % TODO: The previous sentence is hard to parse
246 This filter has a probability of 80\% of not being applied.\\
246 {\bf Gaussian Noise.} 247 {\bf Gaussian Noise.}
247 This filter simply adds, to each pixel of the image independently, a 248 This filter simply adds, to each pixel of the image independently, a
248 noise $\sim Normal(0,(\frac{complexity}{10})^2)$. 249 noise $\sim Normal(0,(\frac{complexity}{10})^2)$.
249 It has has a probability of not being applied at all of 70\%.\\ 250 It has a probability of 70\% of not being applied.\\
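A minimal sketch of the Gaussian noise filter, assuming the image is a list of rows of floats. The function name and the `skip_prob` parameter are hypothetical. No clipping to the valid pixel range is done, since the text does not mention any.

```python
import random

def add_gaussian_noise(image, complexity, rng=random, skip_prob=0.7):
    """Add independent Normal(0, (complexity / 10)^2) noise to each pixel.

    With probability `skip_prob` (70% in the text) the filter is not
    applied and an unchanged copy of the image is returned.
    """
    if rng.random() < skip_prob:
        return [row[:] for row in image]
    std = complexity / 10.0
    return [[p + rng.gauss(0.0, std) for p in row] for row in image]
```

Passing a seeded `random.Random` instance as `rng` makes the result reproducible.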
250 {\bf Background Images.} 251 {\bf Background Images.}
251 Following~\citet{Larochelle-jmlr-2009}, this transformation adds a random 252 Following~\citet{Larochelle-jmlr-2009}, this transformation adds a random
252 background behind the letter. The background is chosen by first selecting, 253 background behind the letter. The background is chosen by first selecting,
253 at random, an image from a set of images. Then a 32$\times$32 sub-region 254 at random, an image from a set of images. Then a 32$\times$32 sub-region
254 of that image is chosen as the background image (by sampling position 255 of that image is chosen as the background image (by sampling position
278 computed from the following element-wise operation: $\frac{image + filtered 279 computed from the following element-wise operation: $\frac{image + filtered
279 image \times mask}{mask+1}$. 280 image \times mask}{mask+1}$.
280 This filter has a probability of not being applied at all of 75\%.\\ 281 This filter has a probability of not being applied at all of 75\%.\\
281 {\bf Scratches.} 282 {\bf Scratches.}
282 The scratches module places line-like white patches on the image. The 283 The scratches module places line-like white patches on the image. The
283 lines are heavily transformed images of the digit "1" (one), chosen 284 lines are heavily transformed images of the digit ``1'' (one), chosen
284 at random among five thousand such 1 images. The 1 image is 285 at random among five thousand such 1 images. The 1 image is
285 randomly cropped and rotated by an angle $\sim Normal(0,(100 \times 286 randomly cropped and rotated by an angle $\sim Normal(0,(100 \times
286 complexity)^2)$, using bi-cubic interpolation. 287 complexity)^2)$, using bi-cubic interpolation.
287 Two passes of a grey-scale morphological erosion filter 288 Two passes of a grey-scale morphological erosion filter
288 are applied, reducing the width of the line 289 are applied, reducing the width of the line