ift6266: comparison of writeup/nips2010_submission.tex @ 466:6205481bf33f
asking the questions
author:   Yoshua Bengio <bengioy@iro.umontreal.ca>
date:     Fri, 28 May 2010 17:39:22 -0600
parents:  24f4a8b53fcc
children: e0e57270b2af
comparison
465:a48601e8d431 → 466:6205481bf33f
 layer) and try to reconstruct it from a corrupted version of it. After this
 unsupervised initialization, the stack of denoising auto-encoders can be
 converted into a deep supervised feedforward neural network and trained by
 stochastic gradient descent.
 
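To make the procedure described above concrete, here is a minimal sketch of one denoising auto-encoder layer and of greedy layer-wise pretraining, in plain numpy. This is not the paper's code: the class name, tied-weight parameterization, corruption level, and learning rate are all illustrative choices.

```python
import numpy as np

rng = np.random.RandomState(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class DenoisingAutoencoderLayer:
    """One layer: corrupt the input, encode it, and learn to
    reconstruct the *clean* input (tied encoder/decoder weights).
    All names and hyperparameters here are illustrative."""

    def __init__(self, n_visible, n_hidden, corruption=0.3, lr=0.1):
        self.W = rng.uniform(-0.1, 0.1, (n_visible, n_hidden))
        self.b_hid = np.zeros(n_hidden)
        self.b_vis = np.zeros(n_visible)
        self.corruption = corruption
        self.lr = lr

    def encode(self, x):
        return sigmoid(x @ self.W + self.b_hid)

    def train_step(self, x):
        # Corruption: zero out a random subset of input components.
        mask = rng.binomial(1, 1.0 - self.corruption, x.shape)
        x_tilde = x * mask
        h = self.encode(x_tilde)
        x_hat = sigmoid(h @ self.W.T + self.b_vis)  # reconstruction
        # Gradients of the cross-entropy reconstruction loss.
        d_vis = x_hat - x
        d_hid = (d_vis @ self.W) * h * (1.0 - h)
        self.W -= self.lr * (np.outer(x_tilde, d_hid) + np.outer(d_vis, h))
        self.b_vis -= self.lr * d_vis
        self.b_hid -= self.lr * d_hid

# Greedy layer-wise pretraining: the second layer is trained on the
# first layer's encoding of the (clean) input. After pretraining, the
# encoders form the initial hidden layers of a supervised network that
# is fine-tuned by stochastic gradient descent, as described above.
x = rng.binomial(1, 0.5, 784).astype(float)  # stand-in for one binarized image
layer1 = DenoisingAutoencoderLayer(784, 500)
layer2 = DenoisingAutoencoderLayer(500, 500)
layer1.train_step(x)
layer2.train_step(layer1.encode(x))
```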
+In this paper we ask the following questions:
+\begin{enumerate}
+\item Do the good results previously obtained with deep architectures on the
+MNIST digits generalize to the setting of a much larger and richer (but similar)
+dataset, the NIST special database 19, with 62 classes and around 800k examples?
+\item To what extent does the perturbation of input images (e.g. adding
+noise, affine transformations, background images) make the resulting
+classifier better not only on similarly perturbed images but also on
+the {\em original clean examples}?
+\item Do deep architectures benefit more from such {\em out-of-distribution}
+examples, i.e. do they benefit more from the self-taught learning~\cite{RainaR2007} framework?
+\item Similarly, does the feature learning step in deep learning algorithms benefit more
+from training with similar but different classes (i.e. a multi-task learning scenario) than
+a corresponding shallow and purely supervised architecture?
+\end{enumerate}
+The experimental results presented here provide positive evidence for all of these questions.
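Question 2 above names three kinds of input perturbations: added noise, affine transformations, and background images. As a rough illustration only (the paper's actual transformation pipeline is described in the next section; the function name and all parameter values here are arbitrary), assuming numpy and scipy:

```python
import numpy as np
from scipy.ndimage import affine_transform

rng = np.random.RandomState(0)

def perturb(img, background, noise_std=0.1, max_distort=0.2):
    """Apply a small random affine map, additive Gaussian noise, and a
    background image to a character image (ink = 1 on a 0 background).
    Illustrative only; not the pipeline used in the paper."""
    h, w = img.shape
    # Random affine matrix close to the identity, applied about the center.
    A = np.eye(2) + rng.uniform(-max_distort, max_distort, (2, 2))
    center = np.array([h / 2.0, w / 2.0])
    out = affine_transform(img, A, offset=center - A @ center, order=1)
    out = out + rng.normal(0.0, noise_std, out.shape)  # additive pixel noise
    out = np.maximum(out, background)  # overlay character on the background
    return np.clip(out, 0.0, 1.0)

img = np.zeros((32, 32))
img[6:26, 14:18] = 1.0                # crude vertical stroke as a stand-in
bg = rng.uniform(0.0, 0.3, (32, 32))  # stand-in background image
perturbed = perturb(img, bg)
```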
 
 \section{Perturbation and Transformation of Character Images}
 
 This section describes the different transformations we used to generate data, in the order in which they are applied.
 The code for these transformations (mostly Python) is available at