comparison writeup/nips2010_submission.tex @ 466:6205481bf33f

asking the questions
author Yoshua Bengio <bengioy@iro.umontreal.ca>
date Fri, 28 May 2010 17:39:22 -0600
parents 24f4a8b53fcc
children e0e57270b2af
layer) and try to reconstruct it from a corrupted version of it. After this
unsupervised initialization, the stack of denoising auto-encoders can be
converted into a deep supervised feedforward neural network and trained by
stochastic gradient descent.

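A minimal sketch of this pipeline in numpy (not the authors' code): greedy layer-wise pre-training of denoising auto-encoders with tied weights, followed by supervised fine-tuning of the whole stack plus a softmax output layer, both by stochastic gradient descent. Layer sizes, corruption level, learning rates and the toy data below are illustrative assumptions.

import numpy as np

rng = np.random.RandomState(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pretrain_denoising_layer(X, n_hidden, corruption=0.3, lr=0.1, epochs=5):
    # Train one denoising auto-encoder: corrupt the input, encode it, and learn
    # to reconstruct the *clean* input (cross-entropy loss, tied weights).
    n_visible = X.shape[1]
    W = rng.normal(0.0, 0.01, (n_visible, n_hidden))
    b_h, b_v = np.zeros(n_hidden), np.zeros(n_visible)
    for _ in range(epochs):
        for x in X:
            x_tilde = x * (rng.uniform(size=n_visible) > corruption)  # zero-out corruption
            h = sigmoid(x_tilde @ W + b_h)                            # code of corrupted input
            x_rec = sigmoid(h @ W.T + b_v)                            # reconstruction
            d_v = x_rec - x                                           # reconstruction error signal
            d_h = (d_v @ W) * h * (1.0 - h)                           # hidden error signal
            W -= lr * (np.outer(d_v, h) + np.outer(x_tilde, d_h))
            b_v -= lr * d_v
            b_h -= lr * d_h
    return W, b_h

def pretrain_stack(X, layer_sizes):
    # Greedy layer-wise unsupervised initialization of the deep network.
    params, H = [], X
    for n_hidden in layer_sizes:
        W, b = pretrain_denoising_layer(H, n_hidden)
        params.append([W, b])
        H = sigmoid(H @ W + b)  # clean representation fed to the next layer
    return params

def fine_tune(params, X, y, n_classes, lr=0.01, epochs=5):
    # Supervised fine-tuning: the pre-trained encoders become a feedforward
    # net, a softmax layer is added on top, and everything is trained by SGD.
    V = rng.normal(0.0, 0.01, (params[-1][0].shape[1], n_classes))
    c = np.zeros(n_classes)
    for _ in range(epochs):
        for x, t in zip(X, y):
            hs = [x]
            for W, b in params:
                hs.append(sigmoid(hs[-1] @ W + b))      # forward pass
            logits = hs[-1] @ V + c
            p = np.exp(logits - logits.max())
            p /= p.sum()
            d = p.copy()
            d[t] -= 1.0                                 # softmax cross-entropy gradient
            d_h = (d @ V.T) * hs[-1] * (1.0 - hs[-1])
            V -= lr * np.outer(hs[-1], d)
            c -= lr * d
            for i in range(len(params) - 1, -1, -1):    # backprop through the stack
                W, b = params[i]
                d_prev = (d_h @ W.T) * hs[i] * (1.0 - hs[i]) if i > 0 else None
                params[i] = [W - lr * np.outer(hs[i], d_h), b - lr * d_h]
                d_h = d_prev
    return params, V, c

# Toy usage: random binary "images", 10 classes.
X = (rng.uniform(size=(100, 32 * 32)) > 0.8).astype(float)
y = rng.randint(0, 10, size=100)
stack = pretrain_stack(X, layer_sizes=[100, 100])      # unsupervised phase
stack, V, c = fine_tune(stack, X, y, n_classes=10)     # supervised phase
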
In this paper we ask the following questions:
\begin{enumerate}
\item Do the good results previously obtained with deep architectures on the
MNIST digits generalize to the setting of a much larger and richer (but similar)
dataset, NIST Special Database 19, with 62 classes and around 800k examples?
\item To what extent does training on perturbed input images (e.g. adding
noise, affine transformations, background images) improve the resulting
classifier, not only on similarly perturbed test images but also on
the {\em original clean examples}? (A sketch of such perturbations follows this list.)
\item Do deep architectures benefit more than shallow ones from such {\em out-of-distribution}
examples, i.e. from the self-taught learning~\cite{RainaR2007} framework?
\item Similarly, does the feature learning step in deep learning algorithms benefit more
from training with similar but different classes (i.e. a multi-task learning scenario) than
does a corresponding shallow and purely supervised architecture?
\end{enumerate}
The experimental results presented here support a positive answer to each of these questions.
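As a concrete illustration of the kind of perturbations referred to in question 2, here is a minimal sketch operating on a 32x32 greyscale character image stored as a numpy array with values in [0, 1]. It is not the transformation pipeline used in our experiments (that pipeline is described in the next section); the background patch and all parameter values are illustrative assumptions.

import numpy as np
from scipy import ndimage

rng = np.random.RandomState(0)

def perturb(image, background, noise_std=0.1, max_shear=0.2, max_shift=2.0):
    # 1. small random affine transformation around the identity (shear/scale + shift)
    matrix = np.eye(2) + rng.uniform(-max_shear, max_shear, (2, 2))
    offset = rng.uniform(-max_shift, max_shift, 2)
    out = ndimage.affine_transform(image, matrix, offset=offset, order=1)
    # 2. additive Gaussian pixel noise
    out = out + rng.normal(0.0, noise_std, out.shape)
    # 3. paste the character onto a background image (pixel-wise maximum)
    out = np.maximum(out, background)
    return np.clip(out, 0.0, 1.0)

# Usage: a crude vertical stroke pasted onto a random background patch.
char = np.zeros((32, 32))
char[8:24, 14:18] = 1.0
bg = rng.uniform(0.0, 0.5, (32, 32))   # stand-in for a real background image
perturbed = perturb(char, bg)
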

\section{Perturbation and Transformation of Character Images}

This section describes, in the order in which they are applied, the different transformations we used to generate data.
The code for these transformations (mostly python) is available at