ift6266: comparison of writeup/nips2010_submission.tex @ 499:2b58eda9fc08
Myriam's changes
author | Yoshua Bengio <bengioy@iro.umontreal.ca> |
---|---|
date | Tue, 01 Jun 2010 12:12:52 -0400 |
parents | 5764a2ae1fb5 |
children | 8479bf822d0e |
495:5764a2ae1fb5 | 499:2b58eda9fc08 |
---|---|
30 human-level performance on both handwritten digit classification and | 30 human-level performance on both handwritten digit classification and |
31 62-class handwritten character recognition. For this purpose we | 31 62-class handwritten character recognition. For this purpose we |
32 developed a powerful generator of stochastic variations and noise | 32 developed a powerful generator of stochastic variations and noise |
33 processes for character images, including not only affine transformations but | 33 processes for character images, including not only affine transformations but |
34 also slant, local elastic deformations, changes in thickness, background | 34 also slant, local elastic deformations, changes in thickness, background |
35 images, color, contrast, occlusion, and various types of pixel and | 35 images, grey level changes, contrast, occlusion, and various types of pixel and |
36 spatially correlated noise. The out-of-distribution examples are | 36 spatially correlated noise. The out-of-distribution examples are |
37 obtained by training with these highly distorted images or | 37 obtained by training with these highly distorted images or |
38 by including object classes different from those in the target test set. | 38 by including object classes different from those in the target test set. |
39 \end{abstract} | 39 \end{abstract} |
40 \vspace*{-2mm} | 40 \vspace*{-2mm} |
275 This filter is only applied 15\% of the time. When it is applied, 50\% | 275 This filter is only applied 15\% of the time. When it is applied, 50\% |
276 of the time, only one patch image is generated and applied. In 30\% of | 276 of the time, only one patch image is generated and applied. In 30\% of |
277 cases, two patches are generated, and otherwise three patches are | 277 cases, two patches are generated, and otherwise three patches are |
278 generated. The patch is applied by taking the maximal value on any given | 278 generated. The patch is applied by taking the maximal value on any given |
279 patch or the original image, for each of the 32x32 pixel locations.\\ | 279 patch or the original image, for each of the 32x32 pixel locations.\\ |
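For concreteness, here is a minimal NumPy sketch of the patch-compositing step described above (the function names and the $[0,1]$ pixel convention are illustrative assumptions, not the authors' code):

```python
import numpy as np

def sample_num_patches(rng=np.random):
    """Draw how many patch images to generate, given that the filter
    fires (which it does only 15% of the time): one patch 50% of the
    time, two patches 30% of the time, three otherwise."""
    u = rng.uniform()
    if u < 0.5:
        return 1
    if u < 0.8:
        return 2
    return 3

def composite_patches(image, patches):
    """Apply the patches by taking, at each of the 32x32 pixel
    locations, the maximal value over the original image and every
    generated patch (pixel values assumed in [0, 1])."""
    out = image.copy()
    for patch in patches:
        out = np.maximum(out, patch)
    return out
```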
280 {\bf Color and Contrast Changes.} | 280 {\bf Grey Level and Contrast Changes.} |
281 This filter changes the contrast and may invert the image polarity (white | 281 This filter changes the contrast and may invert the image polarity (white |
282 on black to black on white). The contrast $C$ is defined here as the | 282 on black to black on white). The contrast $C$ is defined here as the |
283 difference between the maximum and the minimum pixel value of the image. | 283 difference between the maximum and the minimum pixel value of the image. |
284 Contrast $\sim U[1-0.85 \times complexity,1]$ (so contrast $\geq 0.15$). | 284 Contrast $\sim U[1-0.85 \times complexity,1]$ (so contrast $\geq 0.15$). |
285 The image is normalized into $[\frac{1-C}{2},1-\frac{1-C}{2}]$. The | 285 The image is normalized into $[\frac{1-C}{2},1-\frac{1-C}{2}]$. The |
286 polarity is inverted with $0.5$ probability. | 286 polarity is inverted with $0.5$ probability. |
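A hedged sketch of the grey level and contrast change, following the formulas above; rescaling the input by its own min/max before mapping into $[\frac{1-C}{2},1-\frac{1-C}{2}]$ is an assumption, as is the $[0,1]$ pixel range:

```python
import numpy as np

def grey_level_and_contrast(image, complexity, rng=np.random):
    """Sketch of the grey level / contrast module described above.
    Assumes pixel values in [0, 1] and complexity in [0, 1]."""
    # Contrast C ~ U[1 - 0.85 * complexity, 1], hence C >= 0.15.
    C = rng.uniform(1.0 - 0.85 * complexity, 1.0)
    # Assumed min/max rescaling, then mapping into [(1-C)/2, 1-(1-C)/2].
    lo, hi = image.min(), image.max()
    normed = (image - lo) / max(hi - lo, 1e-8)
    out = (1.0 - C) / 2.0 + C * normed
    # Invert polarity (white on black <-> black on white) with prob. 0.5.
    if rng.uniform() < 0.5:
        out = 1.0 - out
    return out
```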
287 | 287 |
288 | 288 \iffalse |
289 \begin{figure}[h] | 289 \begin{figure}[h] |
290 \resizebox{.99\textwidth}{!}{\includegraphics{images/example_t.png}}\\ | 290 \resizebox{.99\textwidth}{!}{\includegraphics{images/example_t.png}}\\ |
291 \caption{Illustration of the pipeline of stochastic | 291 \caption{Illustration of the pipeline of stochastic |
292 transformations applied to the image of a lower-case t | 292 transformations applied to the image of a lower-case t |
293 (the upper left image). Each image in the pipeline (going from | 293 (the upper left image). Each image in the pipeline (going from |
294 left to right, first top line, then bottom line) shows the result | 294 left to right, first top line, then bottom line) shows the result |
295 of applying one of the modules in the pipeline. The last image | 295 of applying one of the modules in the pipeline. The last image |
296 (bottom right) is used as a training example.} | 296 (bottom right) is used as a training example.} |
297 \label{fig:pipeline} | 297 \label{fig:pipeline} |
298 \end{figure} | 298 \end{figure} |
299 | 299 \fi |
300 | 300 |
301 \begin{figure}[h] | 301 \begin{figure}[h] |
302 \resizebox{.99\textwidth}{!}{\includegraphics{images/transfo.png}}\\ | 302 \resizebox{.99\textwidth}{!}{\includegraphics{images/transfo.png}}\\ |
303 \caption{Illustration of each transformation applied alone to the same image | 303 \caption{Illustration of each transformation applied alone to the same image |
304 of an upper-case h (top left). First row (from left to right) : original image, slant, | 304 of an upper-case h (top left). First row (from left to right) : original image, slant, |
305 thickness, affine transformation, local elastic deformation; second row (from left to right) : | 305 thickness, affine transformation (translation, rotation, shear), |
| 306 local elastic deformation; second row (from left to right) : |
306 pinch, motion blur, occlusion, pixel permutation, Gaussian noise; third row (from left to right) : | 307 pinch, motion blur, occlusion, pixel permutation, Gaussian noise; third row (from left to right) : |
307 background image, salt and pepper noise, spatially Gaussian noise, scratches, | 308 background image, salt and pepper noise, spatially Gaussian noise, scratches, |
308 color and contrast changes.} | 309 grey level and contrast changes.} |
309 \label{fig:transfo} | 310 \label{fig:transfo} |
310 \end{figure} | 311 \end{figure} |
311 | 312 |
312 | 313 |
313 \vspace*{-1mm} | 314 \vspace*{-1mm} |
318 the MNIST digits classification task~\citep{Hinton06,ranzato-07,Bengio-nips-2006,Salakhutdinov+Hinton-2009}, | 319 the MNIST digits classification task~\citep{Hinton06,ranzato-07,Bengio-nips-2006,Salakhutdinov+Hinton-2009}, |
319 with 60~000 examples, and variants involving 10~000 | 320 with 60~000 examples, and variants involving 10~000 |
320 examples~\cite{Larochelle-jmlr-toappear-2008,VincentPLarochelleH2008}, we want | 321 examples~\cite{Larochelle-jmlr-toappear-2008,VincentPLarochelleH2008}, we want |
321 to focus here on the case of much larger training sets, from 10 times to | 322 to focus here on the case of much larger training sets, from 10 times to |
322 1000 times larger. The larger datasets are obtained by first sampling from | 323 1000 times larger. The larger datasets are obtained by first sampling from |
323 a {\em data source} (NIST characters, scanned machine printed characters, characters | 324 a {\em data source}: {\bf NIST} (NIST database 19), {\bf Fonts}, {\bf Captchas}, |
324 from fonts, or characters from captchas) and then optionally applying some of the | 325 and {\bf OCR data} (scanned machine printed characters). Once a character |
325 above transformations and/or noise processes. | 326 is sampled from one of these sources (chosen randomly), a pipeline of |
| 327 the above transformations and/or noise processes is applied to the |
| 328 image. |
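The generation loop described here can be sketched as follows; the `sources` and `pipeline` interfaces are assumed for illustration and are not the authors' actual API:

```python
import numpy as np

def generate_example(sources, pipeline, complexity, rng=np.random):
    """Pick a data source at random (NIST, Fonts, Captchas, OCR data),
    sample a character image and its label from it, then apply each
    transformation / noise module of the pipeline in sequence."""
    source = sources[rng.randint(len(sources))]
    image, label = source.sample(rng)              # assumed interface
    for transform in pipeline:                     # modules listed above
        image = transform(image, complexity, rng)
    return image, label
```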
326 | 329 |
327 \vspace*{-1mm} | 330 \vspace*{-1mm} |
328 \subsection{Data Sources} | 331 \subsection{Data Sources} |
329 \vspace*{-1mm} | 332 \vspace*{-1mm} |
330 | 333 |