# HG changeset patch
# User Dumitru Erhan
# Date 1275415449 25200
# Node ID d057941417ed1518741380d202ea55ba41bed1d3
# Parent 8c2ab4f246b19be8d62c02fa4b2ff50a9f35242e
a few changes in the first section

diff -r 8c2ab4f246b1 -r d057941417ed writeup/nips2010_submission.tex
--- a/writeup/nips2010_submission.tex	Tue Jun 01 10:59:47 2010 -0700
+++ b/writeup/nips2010_submission.tex	Tue Jun 01 11:04:09 2010 -0700
@@ -20,7 +20,7 @@
  Recent theoretical and empirical work in statistical machine learning has
  demonstrated the importance of learning algorithms for deep
  architectures, i.e., function classes obtained by composing multiple
- non-linear transformations. The self-taught learning (exploiting unlabeled
+ non-linear transformations. Self-taught learning (exploiting unlabeled
  examples or examples from other distributions) has already been applied
  to deep learners, but mostly to show the advantage of unlabeled
  examples. Here we explore the advantage brought by {\em out-of-distribution
@@ -74,8 +74,8 @@
 performed similarly or better than previously proposed Restricted Boltzmann
 Machines in terms of unsupervised extraction of a hierarchy of features
 useful for classification. The principle is that each layer starting from
-the bottom is trained to encode their input (the output of the previous
-layer) and try to reconstruct it from a corrupted version of it. After this
+the bottom is trained to encode its input (the output of the previous
+layer) and to reconstruct it from a corrupted version of it. After this
 unsupervised initialization, the stack of denoising auto-encoders can be
 converted into a deep supervised feedforward neural network and fine-tuned by
 stochastic gradient descent.
@@ -91,6 +91,8 @@
 (but see~\citep{CollobertR2008}). In particular the {\em relative
 advantage} of deep learning for this settings has not been evaluated.
 
+% TODO: why we care to evaluate this relative advantage
+
 In this paper we ask the following questions:
 
 %\begin{enumerate}
@@ -115,7 +117,7 @@
 a corresponding shallow and purely supervised architecture?
 %\end{enumerate}
 
-The experimental results presented here provide positive evidence towards all of these questions.
+Our experimental results provide positive evidence towards all of these questions.
 
 \vspace*{-1mm}
 \section{Perturbation and Transformation of Character Images}