Mercurial > ift6266
comparison writeup/nips2010_submission.tex @ 517:0a5945249f2b
section 2, quick first pass
author | Dumitru Erhan <dumitru.erhan@gmail.com> |
---|---|
date | Tue, 01 Jun 2010 11:14:48 -0700 |
parents | d057941417ed |
children | 460a4e78c9a4 |
511:d057941417ed | 517:0a5945249f2b |
---|---|
155 {\bf Slant.} | 155 {\bf Slant.} |
156 We mimic slant by shifting each row of the image | 156 We mimic slant by shifting each row of the image |
157 proportionally to its height: $shift = round(slant \times height)$. | 157 proportionally to its height: $shift = round(slant \times height)$. |
158 The $slant$ coefficient can be negative or positive with equal probability | 158 The $slant$ coefficient can be negative or positive with equal probability |
159 and its value is randomly sampled according to the complexity level: | 159 and its value is randomly sampled according to the complexity level: |
160 e $slant \sim U[0,complexity]$, so the | 160 $slant \sim U[0,complexity]$, so the |
161 maximum displacement for the lowest or highest pixel line is of | 161 maximum displacement for the lowest or highest pixel line is of |
162 $round(complexity \times 32)$.\\ | 162 $round(complexity \times 32)$.\\ |
163 {\bf Thickness.} | 163 {\bf Thickness.} |
164 Morphological operators of dilation and erosion~\citep{Haralick87,Serra82} | 164 Morphological operators of dilation and erosion~\citep{Haralick87,Serra82} |
165 are applied. The neighborhood of each pixel is multiplied | 165 are applied. The neighborhood of each pixel is multiplied |
185 variability of the transformation: $a$ and $d$ $\sim U[1-3 \times | 185 variability of the transformation: $a$ and $d$ $\sim U[1-3 \times |
186 complexity,1+3 \times complexity]$, $b$ and $e$ $\sim[-3 \times complexity,3 | 186 complexity,1+3 \times complexity]$, $b$ and $e$ $\sim[-3 \times complexity,3 |
187 \times complexity]$ and $c$ and $f$ $\sim U[-4 \times complexity, 4 \times | 187 \times complexity]$ and $c$ and $f$ $\sim U[-4 \times complexity, 4 \times |
188 complexity]$.\\ | 188 complexity]$.\\ |
189 {\bf Local Elastic Deformations.} | 189 {\bf Local Elastic Deformations.} |
190 This filter induces a "wiggly" effect in the image, following~\citet{SimardSP03-short}, | 190 This filter induces a ``wiggly'' effect in the image, following~\citet{SimardSP03-short}, |
191 which provides more details. | 191 which provides more details. |
192 Two "displacements" fields are generated and applied, for horizontal | 192 Two ``displacements'' fields are generated and applied, for horizontal |
193 and vertical displacements of pixels. | 193 and vertical displacements of pixels. |
194 To generate a pixel in either field, first a value between -1 and 1 is | 194 To generate a pixel in either field, first a value between -1 and 1 is |
195 chosen from a uniform distribution. Then all the pixels, in both fields, are | 195 chosen from a uniform distribution. Then all the pixels, in both fields, are |
196 multiplied by a constant $\alpha$ which controls the intensity of the | 196 multiplied by a constant $\alpha$ which controls the intensity of the |
197 displacements (larger $\alpha$ translates into larger wiggles). | 197 displacements (larger $\alpha$ translates into larger wiggles). |
198 Each field is convoluted with a Gaussian 2D kernel of | 198 Each field is convoluted with a Gaussian 2D kernel of |
199 standard deviation $\sigma$. Visually, this results in a blur. | 199 standard deviation $\sigma$. Visually, this results in a blur. |
200 $\alpha = \sqrt[3]{complexity} \times 10.0$ and $\sigma = 10 - 7 \times | 200 $\alpha = \sqrt[3]{complexity} \times 10.0$ and $\sigma = 10 - 7 \times |
201 \sqrt[3]{complexity}$.\\ | 201 \sqrt[3]{complexity}$.\\ |
202 {\bf Pinch.} | 202 {\bf Pinch.} |
203 This GIMP filter is named "Whirl and | 203 This is a GIMP filter called ``Whirl and |
204 pinch", but whirl was set to 0. A pinch is ``similar to projecting the image onto an elastic | 204 pinch'', but whirl was set to 0. A pinch is ``similar to projecting the image onto an elastic |
205 surface and pressing or pulling on the center of the surface''~\citep{GIMP-manual}. | 205 surface and pressing or pulling on the center of the surface''~\citep{GIMP-manual}. |
206 For a square input image, think of drawing a circle of | 206 For a square input image, this is akin to drawing a circle of |
207 radius $r$ around a center point $C$. Any point (pixel) $P$ belonging to | 207 radius $r$ around a center point $C$. Any point (pixel) $P$ belonging to |
208 that disk (region inside circle) will have its value recalculated by taking | 208 that disk (region inside circle) will have its value recalculated by taking |
209 the value of another "source" pixel in the original image. The position of | 209 the value of another ``source'' pixel in the original image. The position of |
210 that source pixel is found on the line that goes through $C$ and $P$, but | 210 that source pixel is found on the line that goes through $C$ and $P$, but |
211 at some other distance $d_2$. Define $d_1$ to be the distance between $P$ | 211 at some other distance $d_2$. Define $d_1$ to be the distance between $P$ |
212 and $C$. $d_2$ is given by $d_2 = sin(\frac{\pi{}d_1}{2r})^{-pinch} \times | 212 and $C$. $d_2$ is given by $d_2 = sin(\frac{\pi{}d_1}{2r})^{-pinch} \times |
213 d_1$, where $pinch$ is a parameter to the filter. | 213 d_1$, where $pinch$ is a parameter to the filter. |
214 The actual value is given by bilinear interpolation considering the pixels | 214 The actual value is given by bilinear interpolation considering the pixels |
220 {\large\bf Injecting Noise} | 220 {\large\bf Injecting Noise} |
221 | 221 |
222 \vspace*{1mm} | 222 \vspace*{1mm} |
223 | 223 |
224 {\bf Motion Blur.} | 224 {\bf Motion Blur.} |
225 This GIMP filter is a ``linear motion blur'' in GIMP | 225 This is a ``linear motion blur'' in GIMP |
226 terminology, with two parameters, $length$ and $angle$. The value of | 226 terminology, with two parameters, $length$ and $angle$. The value of |
227 a pixel in the final image is the approximately mean value of the $length$ first pixels | 227 a pixel in the final image is approximately the mean value of the $length$ first pixels |
228 found by moving in the $angle$ direction. | 228 found by moving in the $angle$ direction. |
229 Here $angle \sim U[0,360]$ degrees, and $length \sim {\rm Normal}(0,(3 \times complexity)^2)$.\\ | 229 Here $angle \sim U[0,360]$ degrees, and $length \sim {\rm Normal}(0,(3 \times complexity)^2)$.\\ |
230 {\bf Occlusion.} | 230 {\bf Occlusion.} |
231 This filter selects a random rectangle from an {\em occluder} character | 231 Selects a random rectangle from an {\em occluder} character |
232 images and places it over the original {\em occluded} character | 232 images and places it over the original {\em occluded} character |
233 image. Pixels are combined by taking the max(occluder,occluded), | 233 image. Pixels are combined by taking the max(occluder,occluded), |
234 closer to black. The corners of the occluder The rectangle corners | 234 closer to black. The rectangle corners |
235 are sampled so that larger complexity gives larger rectangles. | 235 are sampled so that larger complexity gives larger rectangles. |
236 The destination position in the occluded image are also sampled | 236 The destination position in the occluded image are also sampled |
237 according to a normal distribution (see more details in~\citet{ift6266-tr-anonymous}). | 237 according to a normal distribution (see more details in~\citet{ift6266-tr-anonymous}). |
238 It has has a probability of not being applied at all of 60\%.\\ | 238 This filter has a probability of 60\% of not being applied.\\ |
239 {\bf Pixel Permutation.} | 239 {\bf Pixel Permutation.} |
240 This filter permutes neighbouring pixels. It selects first | 240 This filter permutes neighbouring pixels. It selects first |
241 $\frac{complexity}{3}$ pixels randomly in the image. Each of them are then | 241 $\frac{complexity}{3}$ pixels randomly in the image. Each of them are then |
242 sequentially exchanged to one other pixel in its $V4$ neighbourhood. Number | 242 sequentially exchanged with one other pixel in its $V4$ neighbourhood. The number |
243 of exchanges to the left, right, top, bottom are equal or does not differ | 243 of exchanges to the left, right, top, bottom is equal or does not differ |
244 from more than 1 if the number of selected pixels is not a multiple of 4. | 244 from more than 1 if the number of selected pixels is not a multiple of 4. |
245 It has has a probability of not being applied at all of 80\%.\\ | 245 % TODO: The previous sentence is hard to parse |
 | 246 This filter has a probability of 80\% of not being applied.\\ |
246 {\bf Gaussian Noise.} | 247 {\bf Gaussian Noise.} |
247 This filter simply adds, to each pixel of the image independently, a | 248 This filter simply adds, to each pixel of the image independently, a |
248 noise $\sim Normal(0(\frac{complexity}{10})^2)$. | 249 noise $\sim Normal(0(\frac{complexity}{10})^2)$. |
249 It has has a probability of not being applied at all of 70\%.\\ | 250 It has a probability of 70\% of not being applied.\\ |
250 {\bf Background Images.} | 251 {\bf Background Images.} |
251 Following~\citet{Larochelle-jmlr-2009}, this transformation adds a random | 252 Following~\citet{Larochelle-jmlr-2009}, this transformation adds a random |
252 background behind the letter. The background is chosen by first selecting, | 253 background behind the letter. The background is chosen by first selecting, |
253 at random, an image from a set of images. Then a 32$\times$32 sub-region | 254 at random, an image from a set of images. Then a 32$\times$32 sub-region |
254 of that image is chosen as the background image (by sampling position | 255 of that image is chosen as the background image (by sampling position |
278 computed from the following element-wise operation: $\frac{image + filtered | 279 computed from the following element-wise operation: $\frac{image + filtered |
279 image \times mask}{mask+1}$. | 280 image \times mask}{mask+1}$. |
280 This filter has a probability of not being applied at all of 75\%.\\ | 281 This filter has a probability of not being applied at all of 75\%.\\ |
281 {\bf Scratches.} | 282 {\bf Scratches.} |
282 The scratches module places line-like white patches on the image. The | 283 The scratches module places line-like white patches on the image. The |
283 lines are heavily transformed images of the digit "1" (one), chosen | 284 lines are heavily transformed images of the digit ``1'' (one), chosen |
284 at random among five thousands such 1 images. The 1 image is | 285 at random among five thousands such 1 images. The 1 image is |
285 randomly cropped and rotated by an angle $\sim Normal(0,(100 \times | 286 randomly cropped and rotated by an angle $\sim Normal(0,(100 \times |
286 complexity)^2$, using bi-cubic interpolation, | 287 complexity)^2$, using bi-cubic interpolation, |
287 Two passes of a grey-scale morphological erosion filter | 288 Two passes of a grey-scale morphological erosion filter |
288 are applied, reducing the width of the line | 289 are applied, reducing the width of the line |
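For readers checking the formulas in the hunks above, here is a minimal Python sketch of three of the parameter computations the text states: the slant row shift, the elastic-deformation $\alpha$ and $\sigma$, and the pinch source distance $d_2$. It is not the code from the repository. The function names, the handling of $d_1 = 0$, and the use of Python's built-in `round` (which rounds halves to even) are assumptions made only for illustration.

```python
import math


def slant_shift(slant, height):
    """Row shift for the slant filter: shift = round(slant * height).

    Assumption: Python's built-in round() stands in for the paper's
    unspecified rounding rule.
    """
    return round(slant * height)


def elastic_params(complexity):
    """Return (alpha, sigma) for the local elastic deformation.

    alpha = cbrt(complexity) * 10.0
    sigma = 10 - 7 * cbrt(complexity)
    """
    root = complexity ** (1.0 / 3.0)
    return 10.0 * root, 10.0 - 7.0 * root


def pinch_source_distance(d1, r, pinch):
    """Distance d2 from the center C to the source pixel for the pinch filter.

    d2 = sin(pi * d1 / (2 * r)) ** (-pinch) * d1, for a point P at distance
    d1 from C inside the disk of radius r.

    Assumption: the center itself (d1 == 0) maps to itself, which avoids
    raising 0 to a negative power.
    """
    if d1 == 0:
        return 0.0
    return math.sin(math.pi * d1 / (2.0 * r)) ** (-pinch) * d1


if __name__ == "__main__":
    # Largest slant at complexity 0.5 on a 32-pixel-high image.
    print(slant_shift(0.5, 32))
    # Elastic parameters at full complexity.
    print(elastic_params(1.0))
    # A point halfway to the edge of a radius-16 disk, pinch = 1.
    print(pinch_source_distance(8, 16, 1.0))
```

With a positive `pinch`, a point strictly inside the disk takes its value from a source pixel farther from the center ($d_2 > d_1$). Points on the circle ($d_1 = r$) are left in place, because the sine term equals 1 there.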