# HG changeset patch
# User Yoshua Bengio
# Date 1275409685 14400
# Node ID 5927432d8b8d370a1dfed46876333f04680d73fb
# Parent 8479bf822d0e101cc99e3e84cb94940eabcba474
-
diff -r 8479bf822d0e -r 5927432d8b8d writeup/nips2010_submission.tex
--- a/writeup/nips2010_submission.tex Tue Jun 01 12:13:10 2010 -0400
+++ b/writeup/nips2010_submission.tex Tue Jun 01 12:28:05 2010 -0400
@@ -53,7 +53,7 @@
 those observed in the first two of these stages (in areas V1 and V2 of visual cortex)~\citep{HonglakL2008}, and that they become more and more invariant to factors of variation (such as camera movement) in
-higher layers~\cite{Goodfellow2009}.
+higher layers~\citep{Goodfellow2009}.
 Learning a hierarchy of features increases the ease and practicality of developing representations that are at once tailored to specific tasks, yet are able to borrow statistical strength
@@ -133,6 +133,18 @@
 from slant to pinch below, performs transformations. The second part, from blur to contrast, adds different kinds of noise.
+\begin{figure}[h]
+\resizebox{.99\textwidth}{!}{\includegraphics{images/transfo.png}}\\
+\caption{Illustration of each transformation applied alone to the same image
+of an upper-case h (top left). First row (from left to right) : original image, slant,
+thickness, affine transformation (translation, rotation, shear),
+local elastic deformation; second row (from left to right) :
+pinch, motion blur, occlusion, pixel permutation, Gaussian noise; third row (from left to right) :
+background image, salt and pepper noise, spatially Gaussian noise, scratches,
+grey level and contrast changes.}
+\label{fig:transfo}
+\end{figure}
+
 {\large\bf Transformations}
 \vspace*{2mm}
@@ -298,18 +310,6 @@
 \end{figure}
 \fi
-\begin{figure}[h]
-\resizebox{.99\textwidth}{!}{\includegraphics{images/transfo.png}}\\
-\caption{Illustration of each transformation applied alone to the same image
-of an upper-case h (top left). First row (from left to right) : original image, slant,
-thickness, affine transformation (translation, rotation, shear),
-local elastic deformation; second row (from left to right) :
-pinch, motion blur, occlusion, pixel permutation, Gaussian noise; third row (from left to right) :
-background image, salt and pepper noise, spatially Gaussian noise, scratches,
-grey level and contrast changes.}
-\label{fig:transfo}
-\end{figure}
-
 \vspace*{-1mm}
 \section{Experimental Setup}
@@ -318,7 +318,7 @@
 Whereas much previous work on deep learning algorithms had been performed on the MNIST digits classification task~\citep{Hinton06,ranzato-07,Bengio-nips-2006,Salakhutdinov+Hinton-2009}, with 60~000 examples, and variants involving 10~000
-examples~\cite{Larochelle-jmlr-toappear-2008,VincentPLarochelleH2008}, we want
+examples~\citep{Larochelle-jmlr-toappear-2008,VincentPLarochelleH2008}, we want
 to focus here on the case of much larger training sets, from 10 times to to 1000 times larger. The larger datasets are obtained by first sampling from a {\em data source}: {\bf NIST} (NIST database 19), {\bf Fonts}, {\bf Captchas},
@@ -334,14 +334,14 @@
 %\begin{itemize}
 %\item
 {\bf NIST.}
-Our main source of characters is the NIST Special Database 19~\cite{Grother-1995},
+Our main source of characters is the NIST Special Database 19~\citep{Grother-1995},
 widely used for training and testing character
-recognition systems~\cite{Granger+al-2007,Cortes+al-2000,Oliveira+al-2002,Milgram+al-2005}.
-The dataset is composed with 8????? digits and characters (upper and lower cases), with hand checked classifications,
+recognition systems~\citep{Granger+al-2007,Cortes+al-2000,Oliveira+al-2002,Milgram+al-2005}.
+The dataset is composed with 814255 digits and characters (upper and lower cases), with hand checked classifications,
 extracted from handwritten sample forms of 3600 writers. The characters are labelled by one of the 62 classes corresponding to "0"-"9","A"-"Z" and "a"-"z". The dataset contains 8 series of different complexity. The fourth series, $hsf_4$, experimentally recognized to be the most difficult one is recommended
-by NIST as testing set and is used in our work and some previous work~\cite{Granger+al-2007,Cortes+al-2000,Oliveira+al-2002,Milgram+al-2005}
+by NIST as testing set and is used in our work and some previous work~\citep{Granger+al-2007,Cortes+al-2000,Oliveira+al-2002,Milgram+al-2005}
 for that purpose. We randomly split the remainder into a training set and a validation set for model selection. The sizes of these data sets are: 651668 for training, 80000 for validation, and 82587 for testing.
@@ -389,7 +389,7 @@
 %\begin{itemize}
 %\item
-{\bf NIST.} This is the raw NIST special database 19.
+{\bf NIST.} This is the raw NIST special database 19~\citep{Grother-1995}.
 %\item
 {\bf P07.} This dataset is obtained by taking raw characters from all four of the above sources