ift6266: changeset 482:ce69aa9204d8
title change and abstract rewrite
author   | Yoshua Bengio <bengioy@iro.umontreal.ca>
date     | Mon, 31 May 2010 13:59:11 -0400
parents  | 3e4290448eeb
children | b9cdb464de5f
files    | writeup/nips2010_submission.tex
diffstat | 1 files changed, 24 insertions(+), 29 deletions(-)
--- a/writeup/nips2010_submission.tex	Sun May 30 19:43:13 2010 -0400
+++ b/writeup/nips2010_submission.tex	Mon May 31 13:59:11 2010 -0400
@@ -7,7 +7,7 @@
 \usepackage{graphicx,subfigure}
 \usepackage[numbers]{natbib}
 
-\title{Generating and Exploiting Perturbed and Multi-Task Handwritten Training Data for Deep Architectures}
+\title{Deep Self-Taught Learning for Handwritten Character Recognition}
 \author{The IFT6266 Gang}
 
 \begin{document}
@@ -16,30 +16,25 @@
 \maketitle
 
 \begin{abstract}
-Recent theoretical and empirical work in statistical machine learning has
-demonstrated the importance of learning algorithms for deep
-architectures, i.e., function classes obtained by composing multiple
-non-linear transformations. In the area of handwriting recognition,
-deep learning algorithms
-had been evaluated on rather small datasets with a few tens of thousands
-of examples. Here we propose a powerful generator of variations
-of examples for character images based on a pipeline of stochastic
-transformations that include not only the usual affine transformations
-but also the addition of slant, local elastic deformations, changes
-in thickness, background images, color, contrast, occlusion, and
-various types of pixel and spatially correlated noise.
-We evaluate a deep learning algorithm (Stacked Denoising Autoencoders)
-on the task of learning to classify digits and letters transformed
-with this pipeline, using the hundreds of millions of generated examples
-and testing on the full 62-class NIST test set.
-We find that the SDA outperforms its
-shallow counterpart, an ordinary Multi-Layer Perceptron,
-and that it is better able to take advantage of the additional
-generated data, as well as better able to take advantage of
-the multi-task setting, i.e.,
-training from more classes than those of interest in the end.
-In fact, we find that the SDA reaches human performance as
-estimated by the Amazon Mechanical Turk on the 62-class NIST test characters.
+  Recent theoretical and empirical work in statistical machine learning has
+  demonstrated the importance of learning algorithms for deep
+  architectures, i.e., function classes obtained by composing multiple
+  non-linear transformations. Self-taught learning (exploiting unlabeled
+  examples or examples from other distributions) has already been applied
+  to deep learners, but mostly to show the advantage of unlabeled
+  examples. Here we explore the advantage brought by {\em out-of-distribution
+  examples} and show that {\em deep learners benefit more from them than a
+  corresponding shallow learner}, in the area
+  of handwritten character recognition. In fact, we show that they reach
+  human-level performance on both handwritten digit classification and
+  62-class handwritten character recognition. For this purpose we
+  developed a powerful generator of stochastic variations and noise
+  processes for character images, including not only affine transformations
+  but also slant, local elastic deformations, changes in thickness,
+  background images, color, contrast, occlusion, and various types of pixel
+  and spatially correlated noise. The out-of-distribution examples are
+  obtained by training with these highly distorted images or
+  by including object classes different from those in the target test set.
 \end{abstract}
 
 \section{Introduction}
@@ -279,11 +274,11 @@
 \begin{figure}[h]
 \resizebox{.99\textwidth}{!}{\includegraphics{images/transfo.png}}\\
-\caption{Illustration of each transformation applied to the same image
-of the upper-case h (upper-left image). first row (from left to rigth) : original image, slant,
+\caption{Illustration of each transformation applied alone to the same image
+of an upper-case h (top left). First row (from left to right): original image, slant,
 thickness, affine transformation, local elastic deformation; second row (from left to right):
-pinch, motion blur, occlusion, pixel permutation, gaussian noise; third row (from left to rigth) :
-background image, salt and pepper noise, spatially gaussian noise, scratches,
+pinch, motion blur, occlusion, pixel permutation, Gaussian noise; third row (from left to right):
+background image, salt and pepper noise, spatially Gaussian noise, scratches,
 color and contrast changes.}
 \label{fig:transfo}
 \end{figure}
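The rewritten abstract describes the distortion pipeline only in prose. As a rough illustration of what such a pipeline can look like, here is a minimal Python/NumPy sketch of a few of the listed perturbations (affine transformation, slant, Gaussian noise, salt-and-pepper noise, occlusion) applied stochastically in sequence. Every function name and parameter range below is an assumption made for illustration; this is not the generator actually used in the ift6266 repository.

# Illustrative sketch only -- NOT the ift6266 generator. It mimics a few of
# the stochastic perturbations named in the abstract on a 32x32 grayscale
# character image with values in [0, 1].
import numpy as np
from scipy.ndimage import affine_transform

rng = np.random.RandomState(0)

def random_affine(img, max_delta=0.15):
    # Small random perturbation of the identity map, about the image center.
    h, w = img.shape
    m = np.eye(2) + rng.uniform(-max_delta, max_delta, (2, 2))
    center = np.array([h, w]) / 2.0
    return affine_transform(img, m, offset=center - m.dot(center), order=1)

def add_slant(img, max_slant=0.3):
    # Horizontal shear whose magnitude grows with the row index ("slant").
    s = rng.uniform(-max_slant, max_slant)
    m = np.array([[1.0, 0.0], [s, 1.0]])
    return affine_transform(img, m, offset=(0.0, -s * img.shape[0] / 2.0), order=1)

def add_gaussian_noise(img, sigma=0.05):
    # Independent per-pixel noise, clipped back to valid gray levels.
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def salt_and_pepper(img, p=0.05):
    # Force a fraction p of pixels to pure black or pure white.
    out = img.copy()
    u = rng.uniform(size=img.shape)
    out[u < p / 2.0] = 0.0
    out[u > 1.0 - p / 2.0] = 1.0
    return out

def occlude(img, max_frac=0.3):
    # Blank out a random rectangle covering up to max_frac of each side.
    h, w = img.shape
    rh = rng.randint(1, int(h * max_frac) + 1)
    rw = rng.randint(1, int(w * max_frac) + 1)
    r0 = rng.randint(0, h - rh)
    c0 = rng.randint(0, w - rw)
    out = img.copy()
    out[r0:r0 + rh, c0:c0 + rw] = 0.0
    return out

def perturb(img, p_apply=0.7):
    # The pipeline: each distortion fires independently with probability p_apply.
    for f in (random_affine, add_slant, add_gaussian_noise, salt_and_pepper, occlude):
        if rng.uniform() < p_apply:
            img = f(img)
    return img

if __name__ == "__main__":
    img = rng.uniform(size=(32, 32))   # stand-in for a real character image
    print(perturb(img).shape)          # (32, 32): same shape, distorted content

Applying each distortion only with some probability, and drawing its parameters afresh each time, is what lets a pipeline of this kind turn a fixed labeled set into an effectively unlimited stream of out-of-distribution training examples.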