comparison writeup/nips2010_submission.tex @ 544:1cdfc17e890f

it fits now
author Yoshua Bengio <bengioy@iro.umontreal.ca>
date Wed, 02 Jun 2010 10:33:37 -0400
parents 8aad1c6ec39a
children 316c7bdad5ad
541:8aad1c6ec39a 544:1cdfc17e890f
225 {\large\bf Injecting Noise} 225 {\large\bf Injecting Noise}
226 226
227 \vspace*{0.5mm} 227 \vspace*{0.5mm}
228 228
229 {\bf Motion Blur.} 229 {\bf Motion Blur.}
230 This is a ``linear motion blur'' in GIMP 230 This is GIMP's ``linear motion blur''
231 terminology, with two parameters, $length$ and $angle$. The value of 231 with parameters $length$ and $angle$. The value of
232 a pixel in the final image is approximately the mean value of the first $length$ pixels 232 a pixel in the final image is approximately the mean value of the first $length$ pixels
233 found by moving in the $angle$ direction. 233 found by moving in the $angle$ direction.
234 Here $angle \sim U[0,360]$ degrees, and $length \sim {\rm Normal}(0,(3 \times complexity)^2)$. 234 Here $angle \sim U[0,360]$ degrees, and $length \sim {\rm Normal}(0,(3 \times complexity)^2)$.
235 \vspace*{-1mm} 235 \vspace*{-1mm}
236 236
237 {\bf Occlusion.} 237 {\bf Occlusion.}
238 Selects a random rectangle from an {\em occluder} character 238 Selects a random rectangle from an {\em occluder} character
239 images and places it over the original {\em occluded} character 239 image and places it over the original {\em occluded}
240 image. Pixels are combined by taking the max(occluder,occluded), 240 image. Pixels are combined by taking the max(occluder,occluded),
241 closer to black. The rectangle corners 241 closer to black. The rectangle corners
242 are sampled so that larger complexity gives larger rectangles. 242 are sampled so that larger complexity gives larger rectangles.
243 The destination position in the occluded image is also sampled 243 The destination position in the occluded image is also sampled
244 according to a normal distribution (see more details in~\citet{ift6266-tr-anonymous}). 244 according to a normal distribution (more details in~\citet{ift6266-tr-anonymous}).
245 This filter has a probability of 60\% of not being applied. 245 This filter is skipped with probability 60\%.
246 \vspace*{-1mm} 246 \vspace*{-1mm}
247 247
248 {\bf Pixel Permutation.} 248 {\bf Pixel Permutation.}
249 This filter permutes neighbouring pixels. It selects first 249 This filter permutes neighbouring pixels. It selects first
250 $\frac{complexity}{3}$ pixels randomly in the image. Each of them is then 250 $\frac{complexity}{3}$ pixels randomly in the image. Each of them is then
251 sequentially exchanged with one other pixel in its $V4$ neighbourhood. The number 251 sequentially exchanged with one other pixel in its $V4$ neighbourhood.
252 of exchanges to the left, right, top, bottom is equal or does not differ 252 This filter is skipped with probability 80\%.
253 from more than 1 if the number of selected pixels is not a multiple of 4.
254 % TODO: The previous sentence is hard to parse
255 This filter has a probability of 80\% of not being applied.
256 \vspace*{-1mm} 253 \vspace*{-1mm}
257 254
258 {\bf Gaussian Noise.} 255 {\bf Gaussian Noise.}
259 This filter simply adds, to each pixel of the image independently, a 256 This filter simply adds, to each pixel of the image independently, a
260 noise $\sim Normal(0,(\frac{complexity}{10})^2)$. 257 noise $\sim Normal(0,(\frac{complexity}{10})^2)$.
261 It has a probability of 70\% of not being applied. 258 This filter is skipped with probability 70\%.
262 \vspace*{-1mm} 259 \vspace*{-1mm}
263 260
264 {\bf Background Images.} 261 {\bf Background Images.}
265 Following~\citet{Larochelle-jmlr-2009}, this transformation adds a random 262 Following~\citet{Larochelle-jmlr-2009}, this transformation adds a random
266 background behind the letter. The background is chosen by first selecting, 263 background behind the letter, from a randomly chosen natural image,
267 at random, an image from a set of images. Then a 32$\times$32 sub-region 264 with contrast adjustments depending on $complexity$, to preserve
268 of that image is chosen as the background image (by sampling position 265 more or less of the original character image.
269 uniformly while making sure not to cross image borders).
270 To combine the original letter image and the background image, contrast
271 adjustments are made. We first get the maximal values (i.e. maximal
272 intensity) for both the original image and the background image, $maximage$
273 and $maxbg$. We also have a parameter $contrast \sim U[complexity, 1]$.
274 Each background pixel value is multiplied by $\frac{max(maximage -
275 contrast, 0)}{maxbg}$ (higher contrast yield darker
276 background). The output image pixels are max(background,original).
277 \vspace*{-1mm} 266 \vspace*{-1mm}
278 267
279 {\bf Salt and Pepper Noise.} 268 {\bf Salt and Pepper Noise.}
280 This filter adds noise $\sim U[0,1]$ to random subsets of pixels. 269 This filter adds noise $\sim U[0,1]$ to random subsets of pixels.
281 The number of selected pixels is $0.2 \times complexity$. 270 The number of selected pixels is $0.2 \times complexity$.
282 This filter has a probability of not being applied at all of 75\%. 271 This filter is skipped with probability 75\%.
283 \vspace*{-1mm} 272 \vspace*{-1mm}
284 273
285 {\bf Spatially Gaussian Noise.} 274 {\bf Spatially Gaussian Noise.}
286 Different regions of the image are spatially smoothed. 275 Different regions of the image are spatially smoothed by convolving
287 The image is convolved with a symmetric Gaussian kernel of 276 the image with a symmetric Gaussian kernel of
288 size and variance chosen uniformly in the ranges $[12,12 + 20 \times 277 size and variance chosen uniformly in the ranges $[12,12 + 20 \times
289 complexity]$ and $[2,2 + 6 \times complexity]$. The result is normalized 278 complexity]$ and $[2,2 + 6 \times complexity]$. The result is normalized
290 between $0$ and $1$. We also create a symmetric averaging window, of the 279 between $0$ and $1$. We also create a symmetric averaging window, of the
291 kernel size, with maximum value at the center. For each image we sample 280 kernel size, with maximum value at the center. For each image we sample
292 uniformly from $3$ to $3 + 10 \times complexity$ pixels that will be 281 uniformly from $3$ to $3 + 10 \times complexity$ pixels that will be
293 averaging centers between the original image and the filtered one. We 282 averaging centers between the original image and the filtered one. We
294 initialize to zero a mask matrix of the image size. For each selected pixel 283 initialize to zero a mask matrix of the image size. For each selected pixel
295 we add to the mask the averaging window centered on it. The final image is 284 we add to the mask the averaging window centered on it. The final image is
296 computed from the following element-wise operation: $\frac{image + filtered 285 computed from the following element-wise operation: $\frac{image + filtered
297 image \times mask}{mask+1}$. 286 image \times mask}{mask+1}$.
298 This filter has a probability of not being applied at all of 75\%. 287 This filter is skipped with probability 75\%.
299 \vspace*{-1mm} 288 \vspace*{-1mm}
300 289
301 {\bf Scratches.} 290 {\bf Scratches.}
302 The scratches module places line-like white patches on the image. The 291 The scratches module places line-like white patches on the image. The
303 lines are heavily transformed images of the digit ``1'' (one), chosen 292 lines are heavily transformed images of the digit ``1'' (one), chosen
304 at random among five thousands such 1 images. The 1 image is 293 at random among 500 such 1 images,
305 randomly cropped and rotated by an angle $\sim Normal(0,(100 \times 294 randomly cropped and rotated by an angle $\sim Normal(0,(100 \times
306 complexity)^2)$, using bi-cubic interpolation, 295 complexity)^2)$, using bi-cubic interpolation.
307 Two passes of a grey-scale morphological erosion filter 296 Two passes of a grey-scale morphological erosion filter
308 are applied, reducing the width of the line 297 are applied, reducing the width of the line
309 by an amount controlled by $complexity$. 298 by an amount controlled by $complexity$.
310 This filter is only applied only 15\% of the time. When it is applied, 50\% 299 This filter is skipped with probability 85\%. The probabilities
311 of the time, only one patch image is generated and applied. In 30\% of 300 of applying 1, 2, or 3 patches are (50\%,30\%,20\%).
312 cases, two patches are generated, and otherwise three patches are
313 generated. The patch is applied by taking the maximal value on any given
314 patch or the original image, for each of the 32x32 pixel locations.
315 \vspace*{-1mm} 301 \vspace*{-1mm}
316 302
317 {\bf Grey Level and Contrast Changes.} 303 {\bf Grey Level and Contrast Changes.}
318 This filter changes the contrast and may invert the image polarity (white 304 This filter changes the contrast and may invert the image polarity (white
319 on black to black on white). The contrast $C$ is defined here as the 305 to black and black to white). The contrast is $C \sim U[1-0.85 \times complexity,1]$
320 difference between the maximum and the minimum pixel value of the image. 306 so the image is normalized into $[\frac{1-C}{2},1-\frac{1-C}{2}]$. The
321 Contrast $\sim U[1-0.85 \times complexity,1]$ (so contrast $\geq 0.15$). 307 polarity is inverted with probability 50\%.
322 The image is normalized into $[\frac{1-C}{2},1-\frac{1-C}{2}]$. The
323 polarity is inverted with $0.5$ probability.
324 308
325 \iffalse 309 \iffalse
326 \begin{figure}[ht] 310 \begin{figure}[ht]
327 \centerline{\resizebox{.9\textwidth}{!}{\includegraphics{images/example_t.png}}}\\ 311 \centerline{\resizebox{.9\textwidth}{!}{\includegraphics{images/example_t.png}}}\\
328 \caption{Illustration of the pipeline of stochastic 312 \caption{Illustration of the pipeline of stochastic