comparison writeup/nips2010_submission.tex @ 479:6593e67381a3

Added transformation figure
author Xavier Glorot <glorotxa@iro.umontreal.ca>
date Sun, 30 May 2010 18:54:36 -0400
(bottom right) is used as training example.}
\label{fig:pipeline}
\end{figure}


\begin{figure}[h]
\resizebox{.99\textwidth}{!}{\includegraphics{images/transfo.png}}\\
\caption{Illustration of each transformation applied to the same image
of the upper-case ``h'' (upper-left image). First row (from left to right): original image, slant,
thickness, affine transformation, local elastic deformation; second row (from left to right):
pinch, motion blur, occlusion, pixel permutation, Gaussian noise; third row (from left to right):
background image, salt-and-pepper noise, spatially Gaussian noise, scratches,
color and contrast changes.}
\label{fig:transfo}
\end{figure}


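To make the pixel-level distortions in Figure~\ref{fig:transfo} concrete, here is a minimal sketch of one of them, salt-and-pepper noise, on a grayscale image stored as nested lists of floats. The parameter name {\tt corruption} and its default are our own illustrative choices; the paper's actual implementation and settings are not shown in this excerpt.

```python
import random

def salt_and_pepper(image, corruption=0.2, seed=None):
    """Apply salt-and-pepper noise to a grayscale image given as a list
    of rows of floats in [0, 1].

    Each pixel is independently replaced, with probability `corruption`,
    by either black (0.0, "pepper") or white (1.0, "salt"); other pixels
    are left untouched. `corruption` is a hypothetical parameter name.
    """
    rng = random.Random(seed)
    noisy = []
    for row in image:
        new_row = []
        for pixel in row:
            if rng.random() < corruption:
                new_row.append(float(rng.randint(0, 1)))  # salt or pepper
            else:
                new_row.append(pixel)
        noisy.append(new_row)
    return noisy
```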
\section{Experimental Setup}

Whereas much previous work on deep learning algorithms had been performed on
the MNIST digits classification task~\citep{Hinton06,ranzato-07,Bengio-nips-2006,Salakhutdinov+Hinton-2009},
with 60~000 examples, and variants involving 10~000
extracted from handwritten sample forms of 3600 writers. The characters are labelled by one of the 62 classes
corresponding to ``0''--``9'', ``A''--``Z'' and ``a''--``z''. The dataset contains 8 series of different complexity.
The fourth series, $hsf_4$, experimentally recognized to be the most difficult one, is recommended
by NIST as a testing set and is used for that purpose in our work and in some previous work~\cite{Granger+al-2007,Cortes+al-2000,Oliveira+al-2002,Milgram+al-2005}.
We randomly split the remainder into a training set and a validation set for
model selection. The sizes of these data sets are: XXX for training, XXX for validation,
and XXX for testing.
The performances reported by previous work on that dataset mostly use only the digits.
Here we use all the classes, both in the training and testing phases. This is especially
useful for estimating the effect of a multi-task setting.
Note that the distribution of the classes in the NIST training and test sets differs
substantially: the test set contains relatively many more digits and a uniform distribution
of letters, whereas the letter distribution of the training set is closer to the natural
distribution of letters in text.

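The random train/validation split of the remaining NIST examples can be sketched as follows; the validation fraction and seed shown here are illustrative placeholders, since the actual split sizes are elided (XXX) above.

```python
import random

def split_train_valid(examples, valid_fraction=0.1, seed=0):
    """Randomly split a list of (image, label) examples into a training
    set and a validation set for model selection.

    `valid_fraction` and `seed` are illustrative choices; the actual
    split sizes used in the paper are not stated in this excerpt.
    """
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_fraction)
    return shuffled[n_valid:], shuffled[:n_valid]
```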
\item {\bf Fonts}
In order to have a good variety of sources, we downloaded a large number of free fonts from {\tt http://anonymous.url.net},
%real address {\tt http://cg.scs.carleton.ca/~luc/freefonts.html}
in addition to those shipped with Windows 7; this adds up to a total of $9817$ different fonts from which we can choose uniformly.
Each {\tt .ttf} file is either used as input to the Captcha generator (see next item) or, by producing a corresponding
character image, directly as input to our models.


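Choosing uniformly among the collected font files can be sketched as below; the directory layout and function name are hypothetical, as the paper only states that 9817 fonts were gathered in total.

```python
import glob
import os
import random

def pick_font(font_dirs, seed=None):
    """Pick one .ttf font file uniformly at random from a set of
    directories.

    The directory layout is hypothetical; the paper only reports the
    total count of 9817 fonts (free downloads plus Windows 7's).
    """
    paths = sorted(
        path
        for directory in font_dirs
        for path in glob.glob(os.path.join(directory, "*.ttf"))
    )
    if not paths:
        raise ValueError("no .ttf files found")
    return random.Random(seed).choice(paths)
```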

\item {\bf Captchas}
The Captcha data source is an adaptation of the \emph{pycaptcha} library (a Python-based captcha generator library) for
generating characters in the same format as the NIST dataset. This software is based on
a random character-class generator and various kinds of transformations similar to those described in the previous sections.
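The captcha-style generation loop just described (draw one of the 62 character classes uniformly, render it, then apply random distortions) can be sketched as follows. This mirrors the described pipeline, not pycaptcha's actual API; {\tt render} and {\tt transformations} are caller-supplied callables, and the per-transformation probability of 1/2 is an illustrative assumption.

```python
import random
import string

def generate_example(render, transformations, seed=None):
    """Sketch of a captcha-style example generator.

    Draws one of the 62 NIST classes ("0"-"9", "A"-"Z", "a"-"z")
    uniformly at random, renders it with the caller-supplied `render`
    callable, then applies each distortion in `transformations` with
    probability 1/2 (an illustrative choice, not the paper's setting).
    This mirrors the described pipeline, not pycaptcha's actual API.
    """
    rng = random.Random(seed)
    classes = string.digits + string.ascii_uppercase + string.ascii_lowercase
    label = rng.choice(classes)
    image = render(label)
    for transform in transformations:
        if rng.random() < 0.5:
            image = transform(image)
    return image, label
```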