Mercurial > ift6266
annotate writeup/techreport.tex @ 489:ee9836baade3
merge
author | dumitru@dumitru.mtv.corp.google.com |
---|---|
date | Mon, 31 May 2010 19:07:59 -0700 |
parents | 6593e67381a3 |
children | 8aad1c6ec39a |
rev | line source |
---|---|
379
a21a174c1c18
added writeup skeleton
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
1 \documentclass[12pt,letterpaper]{article} |
a21a174c1c18
added writeup skeleton
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
2 \usepackage[utf8]{inputenc} |
a21a174c1c18
added writeup skeleton
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
3 \usepackage{graphicx} |
a21a174c1c18
added writeup skeleton
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
4 \usepackage{times} |
a21a174c1c18
added writeup skeleton
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
5 \usepackage{mlapa} |
452
b0622f78cfec
Add a small paragraph mentionning the distribution differences and a figure illustrating the difference.
Arnaud Bergeron <abergeron@gmail.com>
parents:
444
diff
changeset
|
6 \usepackage{subfigure} |
379
a21a174c1c18
added writeup skeleton
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
7 |
a21a174c1c18
added writeup skeleton
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
8 \begin{document} |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
9 \title{Generating and Exploiting Perturbed and Multi-Task Handwritten Training Data for Deep Architectures} |
381 | 10 \author{The IFT6266 Gang} |
379
a21a174c1c18
added writeup skeleton
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
11 \date{April 2010, Technical Report, Dept. IRO, U. Montreal} |
a21a174c1c18
added writeup skeleton
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
12 |
a21a174c1c18
added writeup skeleton
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
13 \maketitle |
a21a174c1c18
added writeup skeleton
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
14 |
a21a174c1c18
added writeup skeleton
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
15 \begin{abstract} |
392
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
16 Recent theoretical and empirical work in statistical machine learning has |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
17 demonstrated the importance of learning algorithms for deep |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
18 architectures, i.e., function classes obtained by composing multiple |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
19 non-linear transformations. In the area of handwriting recognition, |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
20 deep learning algorithms |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
21 have so far been evaluated on rather small datasets with a few tens of thousands |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
22 of examples. Here we propose a powerful generator of variations |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
23 of examples for character images based on a pipeline of stochastic |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
24 transformations that include not only the usual affine transformations |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
25 but also the addition of slant, local elastic deformations, changes |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
26 in thickness, background images, color, contrast, occlusion, and |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
27 various types of pixel and spatially correlated noise. |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
28 We evaluate a deep learning algorithm (Stacked Denoising Autoencoders) |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
29 on the task of learning to classify digits and letters transformed |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
30 with this pipeline, using the hundreds of millions of generated examples |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
31 and testing on the full 62-class NIST test set. |
392
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
32 We find that the SDA outperforms its |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
33 shallow counterpart, an ordinary Multi-Layer Perceptron, |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
34 and that it is better able to take advantage of the additional |
438 | 35 generated data, as well as better able to take advantage of |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
36 the multi-task setting, i.e., |
438 | 37 training from more classes than those of interest in the end. |
38 In fact, we find that the SDA reaches human performance as | |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
39 estimated by the Amazon Mechanical Turk on the 62-class NIST test characters. |
379
a21a174c1c18
added writeup skeleton
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
40 \end{abstract} |
a21a174c1c18
added writeup skeleton
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
41 |
a21a174c1c18
added writeup skeleton
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
42 \section{Introduction} |
a21a174c1c18
added writeup skeleton
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
43 |
392
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
44 Deep Learning has emerged as a promising new area of research in |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
45 statistical machine learning (see~\emcite{Bengio-2009} for a review). |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
46 Learning algorithms for deep architectures are centered on the learning |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
47 of useful representations of data, which are better suited to the task at hand. |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
48 This is in great part inspired by observations of the mammalian visual cortex, |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
49 which consists of a chain of processing elements, each of which is associated with a |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
50 different representation. In fact, |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
51 it was found recently that the features learnt in deep architectures resemble |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
52 those observed in the first two of these stages (in areas V1 and V2 |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
53 of visual cortex)~\cite{HonglakL2008}. |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
54 Processing images typically involves transforming the raw pixel data into |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
55 new {\bf representations} that can be used for analysis or classification. |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
56 For example, a principal component analysis representation linearly projects |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
57 the input image into a lower-dimensional feature space. |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
58 Why learn a representation? Current practice in the computer vision |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
59 literature converts the raw pixels into a hand-crafted representation |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
60 (e.g.\ SIFT features~\cite{Lowe04}), but deep learning algorithms |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
61 tend to discover similar features in their first few |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
62 levels~\cite{HonglakL2008,ranzato-08,Koray-08,VincentPLarochelleH2008-very-small}. |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
63 Learning increases the |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
64 ease and practicality of developing representations that are at once |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
65 tailored to specific tasks, yet are able to borrow statistical strength |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
66 from other related tasks (e.g., modeling different kinds of objects). Finally, learning the |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
67 feature representation can lead to higher-level (more abstract, more |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
68 general) features that are more robust to unanticipated sources of |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
69 variance extant in real data. |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
70 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
71 Whereas a deep architecture can in principle be more powerful than a |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
72 shallow one in terms of representation, depth appears to render the |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
73 training problem more difficult in terms of optimization and local minima. |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
74 It is also only recently that successful algorithms were proposed to |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
75 overcome some of these difficulties. All are based on unsupervised |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
76 learning, often in a greedy layer-wise ``unsupervised pre-training'' |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
77 stage~\cite{Bengio-2009}. One of these layer initialization techniques, |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
78 applied here, is the Denoising |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
79 Auto-Encoder~(DAE)~\cite{VincentPLarochelleH2008-very-small}, which |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
80 performed similarly to or better than previously proposed Restricted Boltzmann |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
81 Machines in terms of unsupervised extraction of a hierarchy of features |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
82 useful for classification. The principle is that each layer starting from |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
83 the bottom is trained to encode its input (the output of the previous |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
84 layer) and to reconstruct it from a corrupted version of it. After this |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
85 unsupervised initialization, the stack of denoising auto-encoders can be |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
86 converted into a deep supervised feedforward neural network and trained by |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
87 stochastic gradient descent. |
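The layer-wise principle described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the code used for the experiments: the class name `DenoisingAutoEncoder`, the helper `pretrain_stack`, the tied encoder/decoder weights, the masking noise, the cross-entropy reconstruction loss and all hyper-parameter values are choices made for this sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class DenoisingAutoEncoder:
    """One layer. Tied weights and masking noise are assumptions of this sketch."""

    def __init__(self, n_in, n_hidden, rng):
        self.W = rng.uniform(-0.1, 0.1, size=(n_in, n_hidden))
        self.b_hid = np.zeros(n_hidden)
        self.b_out = np.zeros(n_in)

    def encode(self, x):
        return sigmoid(x @ self.W + self.b_hid)

    def decode(self, h):
        return sigmoid(h @ self.W.T + self.b_out)

    def train_step(self, x, corruption, lr, rng):
        # Corrupt the input: zero each component with probability `corruption`.
        x_tilde = x * (rng.random(x.shape) >= corruption)
        h = self.encode(x_tilde)
        x_hat = self.decode(h)
        # Gradient of the cross-entropy between the CLEAN input and its
        # reconstruction, taken with respect to the output pre-activation.
        d_out = (x_hat - x) / len(x)
        d_hid = (d_out @ self.W) * h * (1.0 - h)
        # Tied weights: encoder and decoder contributions are summed.
        self.W -= lr * (x_tilde.T @ d_hid + d_out.T @ h)
        self.b_hid -= lr * d_hid.sum(axis=0)
        self.b_out -= lr * d_out.sum(axis=0)

def pretrain_stack(x, layer_sizes, rng, corruption=0.25, lr=0.1, steps=200):
    """Greedy pre-training: each layer is trained on the output of the one below."""
    layers, layer_input = [], x
    for n_hidden in layer_sizes:
        dae = DenoisingAutoEncoder(layer_input.shape[1], n_hidden, rng)
        for _ in range(steps):
            dae.train_step(layer_input, corruption, lr, rng)
        layers.append(dae)
        layer_input = dae.encode(layer_input)  # uncorrupted code feeds the next layer
    return layers
```

After pre-training, the `W` and `b_hid` of each layer would serve as the initial weights of the corresponding hidden layer in the supervised network, which is then fine-tuned by stochastic gradient descent.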
420
a3a4a9c6476d
added transformations description and began dataset descriptions
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
417
diff
changeset
|
88 |
407
fe2e2964e7a3
description des transformations en cours ajout d un fichier special.bib pour des references specifiques
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
393
diff
changeset
|
89 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
90 \section{Perturbation and Transformation of Character Images} |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
91 |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
92 This section describes the different transformations we used to generate data, in the order in which they are applied. |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
93 The code for these transformations (mostly Python) is available at |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
94 {\tt http://anonymous.url.net}. All the modules in the pipeline share |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
95 a global control parameter ($0 \le complexity \le 1$) that allows one to modulate the |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
96 amount of deformation or noise introduced. |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
97 |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
98 We can distinguish two main parts in the pipeline. The first one, |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
99 from slant to pinch, performs transformations of the character. The second |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
100 part, from blur to contrast, adds noise to the image. |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
101 |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
102 \subsection{Slant} |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
103 |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
104 In order to mimic a slant effect, we simply shift each row of the image |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
105 proportionally to its height: $shift = round(slant \times height)$. We |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
106 round the shift in order to have a discrete displacement. We do not use a |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
107 filter to smooth the result in order to save computing time and also |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
108 because later transformations have similar effects. |
420
a3a4a9c6476d
added transformations description and began dataset descriptions
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
417
diff
changeset
|
109 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
110 The $slant$ coefficient can be negative or positive with equal probability |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
111 and its value is randomly sampled according to the complexity level. In |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
112 our case we sample a number uniformly from the range $[0,complexity]$, so the |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
113 maximum displacement for the lowest or highest pixel line is |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
114 $round(complexity \times 32)$. |
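The slant rule above can be sketched as follows. The function name `slant_image` is hypothetical, and two details are assumptions of this sketch rather than statements from the report: the "height" of a row is taken to be its index counted from the top, and pixels vacated by the shift are filled with 0.

```python
import numpy as np

def slant_image(image, complexity, rng):
    """Shift every row sideways in proportion to its row index (hypothetical helper)."""
    height, width = image.shape
    # Magnitude drawn uniformly from [0, complexity]; sign chosen with equal probability.
    slant = rng.uniform(0.0, complexity) * rng.choice([-1.0, 1.0])
    out = np.zeros_like(image)  # assumption: vacated pixels become background (0)
    for row in range(height):
        # shift = round(slant * height), clamped so the slices below stay valid.
        shift = int(round(float(slant) * row))
        shift = max(-width, min(width, shift))
        if shift >= 0:
            out[row, shift:] = image[row, :width - shift]
        else:
            out[row, :width + shift] = image[row, -shift:]
    return out
```

For example, `slant_image(img, 0.3, np.random.default_rng(0))` applied to a $32 \times 32$ array returns an array of the same shape in which the first row is left in place and lower rows are displaced progressively further.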
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
115 |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
116 |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
117 \subsection{Thickness} |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
118 |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
119 To change the thickness of the characters we used morphological operators: |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
120 dilation and erosion~\cite{Haralick87,Serra82}. |
420
a3a4a9c6476d
added transformations description and began dataset descriptions
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
417
diff
changeset
|
121 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
122 The basic idea of such a transform is, for each pixel, to multiply in an |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
123 element-wise manner its neighbourhood with a matrix called the structuring |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
124 element. Then for dilation we replace the pixel value by the maximum of |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
125 the result, or the minimum for erosion. This will dilate or erode objects |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
126 in the image and the strength of the transform only depends on the |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
127 structuring element. |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
128 |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
129 We used ten different structuring elements with increasing dimensions (the |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
130 biggest is $5\times5$). For each image, we randomly sample the operator |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
131 type (dilation or erosion) with equal probability and one structuring |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
132 element from a subset of the $n$ smallest structuring elements where $n$ is |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
133 $round(10 \times complexity)$ for dilation and $round(6 \times complexity)$ |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
134 for erosion. A neutral element is always present in the set; if it is |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
135 chosen the transformation is not applied. Erosion allows only the six |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
136 smallest structuring elements because when the character is too thin erosion may |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
137 erase it completely. |
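A minimal sketch of the thickness module follows. The ten structuring elements are not listed here, so the `elements` argument is a placeholder supplied by the caller; the function names `morph` and `change_thickness` are hypothetical. The sketch uses the standard flat grey-scale definitions (maximum or minimum over the pixels selected by a boolean mask), and it assumes that the list is ordered from smallest to largest with the neutral $1 \times 1$ element first.

```python
import numpy as np

def morph(image, element, op):
    """Grey-scale dilation (op="dilate") or erosion (op="erode") with a flat
    boolean structuring element. Positions outside the image are ignored."""
    height, width = image.shape
    eh, ew = element.shape
    fill = -np.inf if op == "dilate" else np.inf
    reduce = np.maximum if op == "dilate" else np.minimum
    # Pad so that every shifted window has the same shape as the image.
    padded = np.full((height + eh - 1, width + ew - 1), fill)
    padded[eh // 2:eh // 2 + height, ew // 2:ew // 2 + width] = image
    out = np.full((height, width), fill)
    for dy in range(eh):
        for dx in range(ew):
            if element[dy, dx]:
                out = reduce(out, padded[dy:dy + height, dx:dx + width])
    return out

def change_thickness(image, complexity, elements, rng):
    """`elements`: placeholder list, smallest first, elements[0] neutral (1x1)."""
    op = "dilate" if rng.random() < 0.5 else "erode"
    n = int(round((10 if op == "dilate" else 6) * complexity))
    # Assumption: the neutral element is always among the candidates.
    n = max(1, min(n, len(elements)))
    return morph(image, elements[rng.integers(n)], op)
```

With this ordering, a low complexity restricts the choice to the smallest elements, and drawing the neutral element leaves the image unchanged.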
407
fe2e2964e7a3
description des transformations en cours ajout d un fichier special.bib pour des references specifiques
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
393
diff
changeset
|
138 |
379
a21a174c1c18
added writeup skeleton
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
139 \subsection{Affine Transformations} |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
140 |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
141 We generate an affine transform matrix according to the complexity level, |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
142 then we apply it directly to the image. The matrix is of size $2 \times |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
143 3$, so we can represent it by six parameters $(a,b,c,d,e,f)$. Formally, |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
144 for each pixel $(x,y)$ of the output image, we give the value of the pixel |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
145 nearest to $(ax+by+c,dx+ey+f)$ in the input image. This allows us to |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
146 produce scaling, translation, rotation and shearing variations. |
431
bfa349f567e8
correction in the transformation descripition
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
428
diff
changeset
|
147 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
148 The sampling of the parameters $(a,b,c,d,e,f)$ has been tuned by hand to |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
149 forbid large rotations (so as not to confuse classes) while still giving good |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
150 variability of the transformation. For each image we sample uniformly the |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
151 parameters in the following ranges: $a$ and $d$ in $[1-3 \times |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
152 complexity,1+3 \times complexity]$, $b$ and $e$ in $[-3 \times complexity,3 |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
153 \times complexity]$ and $c$ and $f$ in $[-4 \times complexity, 4 \times |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
154 complexity]$. |
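The inverse mapping with nearest-pixel lookup can be sketched as below. The function names are hypothetical. One assumption is made loudly: the two coefficients sampled around 1 are placed on the diagonal of the map $(ax+by+c, dx+ey+f)$, i.e. they multiply $x$ in the first coordinate and $y$ in the second, so that $complexity = 0$ yields the identity. Filling out-of-range pixels with 0 is also an assumption.

```python
import numpy as np

def sample_affine_params(complexity, rng):
    """Return (a, b, c, d, e, f) for the map (a*x + b*y + c, d*x + e*y + f)."""
    # Assumption: the coefficients centred on 1 sit on the diagonal (a and e here).
    scale_x, scale_y = rng.uniform(1 - 3 * complexity, 1 + 3 * complexity, size=2)
    shear_x, shear_y = rng.uniform(-3 * complexity, 3 * complexity, size=2)
    shift_x, shift_y = rng.uniform(-4 * complexity, 4 * complexity, size=2)
    return scale_x, shear_x, shift_x, shear_y, scale_y, shift_y

def apply_affine(image, params):
    """For each output pixel (x, y), copy the input pixel nearest to the mapped point."""
    a, b, c, d, e, f = params
    height, width = image.shape
    ys, xs = np.mgrid[0:height, 0:width]
    src_x = np.rint(a * xs + b * ys + c).astype(int)
    src_y = np.rint(d * xs + e * ys + f).astype(int)
    inside = (src_x >= 0) & (src_x < width) & (src_y >= 0) & (src_y < height)
    out = np.zeros_like(image)  # assumption: points mapped outside the image give 0
    out[inside] = image[src_y[inside], src_x[inside]]
    return out
```

Because the map goes from output coordinates back to input coordinates, every output pixel receives exactly one value and no holes appear, which a forward mapping would not guarantee.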
431
bfa349f567e8
correction in the transformation descripition
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
428
diff
changeset
|
155 |
407
fe2e2964e7a3
description des transformations en cours ajout d un fichier special.bib pour des references specifiques
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
393
diff
changeset
|
156 |
379
a21a174c1c18
added writeup skeleton
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
157 \subsection{Local Elastic Deformations} |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
158 |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
159 This filter induces a ``wiggly'' effect in the image. The description here |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
160 will be brief, as the algorithm follows precisely what is described in |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
161 \cite{SimardSP03}. |
415
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
162 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
163 The general idea is to generate two ``displacement'' fields, for horizontal |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
164 and vertical displacements of pixels. Each of these fields has the same |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
165 size as the original image. |
415
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
166 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
167 When generating the transformed image, we'll loop over the x and y |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
168 positions in the fields and select, as a value, the value of the pixel in |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
169 the original image at the (relative) position given by the displacement |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
170 fields for this x and y. If the position we'd retrieve is outside the |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
171 borders of the image, we use a 0 value instead. |
415
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
172 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
173 To generate a pixel in either field, first a value between -1 and 1 is |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
174 chosen from a uniform distribution. Then all the pixels, in both fields, are |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
175 multiplied by a constant $\alpha$ which controls the intensity of the |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
176 displacements (bigger $\alpha$ translates into larger wiggles). |
415
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
177 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
178 As a final step, each field is convolved with a Gaussian 2D kernel of |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
179 standard deviation $\sigma$. Visually, this results in a ``blur'' |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
180 filter. This has the effect of making values next to each other in the |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
181 displacement fields similar. In effect, this makes the wiggles more |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
182 coherent, less noisy. |
415
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
183 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
184 As displacement fields were slow to compute, 50 pairs of fields were |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
185 generated per complexity in increments of 0.1 (50 pairs for 0.1, 50 pairs |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
186 for 0.2, etc.), and afterwards, given a complexity, we selected randomly |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
187 among the 50 corresponding pairs. |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
188 |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
189 $\sigma$ and $\alpha$ were linked to complexity through the formulas |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
190 $\alpha = \sqrt[3]{complexity} \times 10.0$ and $\sigma = 10 - 7 \times |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
191 \sqrt[3]{complexity}$. |
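The two formulas above can be sketched in a few lines of Python. This is only an illustration of the mapping, not the code used for the experiments; the function name \texttt{elastic\_params} is hypothetical.

```python
def elastic_params(complexity):
    """Map a complexity in [0, 1] to (alpha, sigma) using the two formulas above."""
    root = complexity ** (1.0 / 3.0)  # cube root of the complexity
    alpha = 10.0 * root               # intensity of the displacements
    sigma = 10.0 - 7.0 * root         # std. dev. of the Gaussian smoothing kernel
    return alpha, sigma
```

At complexity 1 this gives $\alpha = 10$ and $\sigma = 3$; at complexity 0 it gives $\alpha = 0$ and $\sigma = 10$. Higher complexity therefore means larger displacements that are smoothed less.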
415
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
192 |
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
193 |
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
194 \subsection{Pinch} |
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
195 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
196 This is another GIMP filter we used. The filter is in fact named "Whirl and |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
197 pinch", but we don't use the "whirl" part (whirl is set to 0). As described |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
198 in GIMP, a pinch is "similar to projecting the image onto an elastic |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
199 surface and pressing or pulling on the center of the surface". |
416
5f9d04dda707
Correction d'une erreur pour pinch et ajout d'une ref bibliographique
fsavard
parents:
415
diff
changeset
|
200 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
201 Mathematically, for a square input image, think of drawing a circle of |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
202 radius $r$ around a center point $C$. Any point (pixel) $P$ belonging to |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
203 that disk (region inside circle) will have its value recalculated by taking |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
204 the value of another "source" pixel in the original image. The position of |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
205 that source pixel is found on the line that goes through $C$ and $P$, but |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
206 at some other distance $d_2$. Define $d_1$ to be the distance between $P$ |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
207 and $C$. $d_2$ is given by $d_2 = \sin(\frac{\pi{}d_1}{2r})^{-pinch} \times |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
208 d_1$, where $pinch$ is a parameter to the filter. |
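As a minimal sketch, the source distance $d_2$ can be computed as follows for a point inside the disk ($0 < d_1 \le r$). The function name is hypothetical, and returning 0 at the center itself is our assumption, since the formula is not defined there for positive $pinch$.

```python
import math

def pinch_source_distance(d1, r, pinch):
    """Distance d2 from the center C at which the source pixel is read."""
    if d1 <= 0.0:
        # Assumption: the center maps to itself (formula undefined for pinch > 0).
        return 0.0
    return math.sin(math.pi * d1 / (2.0 * r)) ** (-pinch) * d1
```

On the circle ($d_1 = r$) the sine equals 1, so $d_2 = d_1$ and the border of the disk is left unchanged. Inside the disk, a positive $pinch$ gives $d_2 > d_1$ and a negative $pinch$ gives $d_2 < d_1$.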
415
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
209 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
210 If the region considered is not square then, before computing $d_2$, the |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
211 smallest dimension (x or y) is stretched such that we may consider the |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
212 region as if it were square. Then, after $d_2$ has been computed and |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
213 corresponding components $d_2\_x$ and $d_2\_y$ have been found, the |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
214 component corresponding to the stretched dimension is compressed back by an |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
215 inverse ratio. |
415
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
216 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
217 The actual value is given by bilinear interpolation considering the pixels |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
218 around the (non-integer) source position thus found. |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
219 |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
220 The value for $pinch$ in our case was given by sampling from a uniform |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
221 distribution over the range $[-complexity, 0.7 \times complexity]$. |
415
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
222 |
426
a7fab59de174
change order of transformations
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
425
diff
changeset
|
223 \subsection{Motion Blur} |
415
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
224 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
225 This is a GIMP filter we applied, a "linear motion blur" in GIMP |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
226 terminology. The description will be brief as it is a well-known filter. |
420
a3a4a9c6476d
added transformations description and began dataset descriptions
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
417
diff
changeset
|
227 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
228 This algorithm has two input parameters, $length$ and $angle$. The value of |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
229 a pixel in the final image is the mean value of the first $length$ pixels |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
230 found by moving in the $angle$ direction. An approximation of this idea is |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
231 used, as we won't fall onto precise pixels by following that |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
232 direction. This is done using the Bresenham line algorithm. |
420
a3a4a9c6476d
added transformations description and began dataset descriptions
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
417
diff
changeset
|
233 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
234 The angle, in our case, is chosen from a uniform distribution over |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
235 $[0,360]$ degrees. The length, though, depends on the complexity; it's |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
236 sampled from a Gaussian distribution of mean 0 and standard deviation |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
237 $\sigma = 3 \times complexity$. |
420
a3a4a9c6476d
added transformations description and began dataset descriptions
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
417
diff
changeset
|
238 |
379
a21a174c1c18
added writeup skeleton
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
239 \subsection{Occlusion} |
415
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
240 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
241 This filter selects random parts of other (hereafter "occlusive") letter |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
242 images and places them over the original letter (hereafter "occluded") |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
243 image. To be more precise, having selected a subregion of the occlusive |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
244 image and a destination position in the occluded image, to determine the |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
245 final value for a given overlapping pixel, it selects whichever pixel is |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
246 the lightest. As a reminder, the background value is 0, black, so the value |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
247 nearest to 1 is selected. |
415
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
248 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
249 To select a subpart of the occlusive image, four numbers are generated. For |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
250 compatibility with the code, we'll call them "haut", "bas", "gauche" and |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
251 "droite" (respectively meaning top, bottom, left and right). Each of these |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
252 numbers is selected according to a Gaussian distribution of mean $8 \times |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
253 complexity$ and standard deviation $2$. This means the larger the |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
254 complexity is, the bigger the occlusion will be. The absolute value is |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
255 taken, as the numbers must be positive, and the maximum value is capped at |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
256 15. |
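The sampling of the four window sizes can be sketched as below. The function name is hypothetical, and the sketch leaves out any rounding to integer pixel counts, which the description above does not specify.

```python
import random

def sample_occlusion_window(complexity, rng=random):
    """Sample the four window extents (haut, bas, gauche, droite)."""
    def one_extent():
        # Gaussian of mean 8 * complexity and standard deviation 2;
        # the absolute value keeps the extent positive.
        value = abs(rng.gauss(8.0 * complexity, 2.0))
        return min(value, 15.0)  # cap at 15
    haut, bas, gauche, droite = (one_extent() for _ in range(4))
    return haut, bas, gauche, droite
```

Passing a seeded \texttt{random.Random} instance as \texttt{rng} makes the draw reproducible.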
415
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
257 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
258 These four sizes collectively define a window centered on the middle pixel |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
259 of the occlusive image. This is the part that will be extracted as the |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
260 occlusion. |
415
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
261 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
262 The next step is to select a destination position in the occluded |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
263 image. Vertical and horizontal displacements $y\_arrivee$ and $x\_arrivee$ |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
264 are selected according to Gaussian distributions of mean 0 and of standard |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
265 deviations of, respectively, 3 and 2. Then a horizontal placement mode, |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
266 $place$, is selected from three values meaning |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
267 left, middle or right. |
415
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
268 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
269 If $place$ is "middle", the occlusion will be horizontally centered |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
270 around the horizontal middle of the occluded image, then shifted according |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
271 to $x\_arrivee$. If $place$ is "left", it will be placed on the left of |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
272 the occluded image, then displaced right according to $x\_arrivee$. The |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
273 opposite happens if $place$ is "right". |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
274 |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
275 In both the horizontal and vertical positioning, the maximum position in |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
276 either direction is such that the selected occlusion won't go beyond the |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
277 borders of the occluded image. |
415
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
278 |
416
5f9d04dda707
Correction d'une erreur pour pinch et ajout d'une ref bibliographique
fsavard
parents:
415
diff
changeset
|
279 This filter has a 60\% probability of not being applied at all. |
415
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
280 |
426
a7fab59de174
change order of transformations
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
425
diff
changeset
|
281 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
282 \subsection{Pixel Permutation} |
442
d5b2b6397a5a
added permut pixel
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
431
diff
changeset
|
283 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
284 This filter permutes neighbouring pixels. It first selects |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
285 $\frac{complexity}{3}$ pixels randomly in the image. Each of them is then |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
286 sequentially exchanged with one other pixel in its $V4$ neighbourhood. The numbers |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
287 of exchanges to the left, right, top and bottom are equal, or do not differ |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
288 by more than 1 if the number of selected pixels is not a multiple of 4. |
442
d5b2b6397a5a
added permut pixel
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
431
diff
changeset
|
289 |
d5b2b6397a5a
added permut pixel
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
431
diff
changeset
|
290 It has an 80\% probability of not being applied at all. |
d5b2b6397a5a
added permut pixel
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
431
diff
changeset
|
291 |
d5b2b6397a5a
added permut pixel
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
431
diff
changeset
|
292 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
293 \subsection{Gaussian Noise} |
426
a7fab59de174
change order of transformations
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
425
diff
changeset
|
294 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
295 This filter simply adds, to each pixel of the image independently, a |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
296 Gaussian noise of mean $0$ and standard deviation $\frac{complexity}{10}$. |
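A minimal sketch of this filter, with the image represented as a list of rows of floats. The function name is hypothetical; no clipping is shown because none is described above, so noisy values may fall outside $[0,1]$.

```python
import random

def add_gaussian_noise(image, complexity, rng=random):
    """Return a copy of `image` with independent Gaussian noise added to each pixel."""
    sigma = complexity / 10.0  # standard deviation grows with complexity
    return [[pixel + rng.gauss(0.0, sigma) for pixel in row] for row in image]
```

With complexity 0 the standard deviation is 0 and the image is returned unchanged.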
426
a7fab59de174
change order of transformations
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
425
diff
changeset
|
297 |
a7fab59de174
change order of transformations
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
425
diff
changeset
|
298 It has a 70\% probability of not being applied at all. |
a7fab59de174
change order of transformations
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
425
diff
changeset
|
299 |
a7fab59de174
change order of transformations
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
425
diff
changeset
|
300 |
379
a21a174c1c18
added writeup skeleton
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
301 \subsection{Background Images} |
415
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
302 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
303 Following~\cite{Larochelle-jmlr-2009}, this transformation adds a random |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
304 background behind the letter. The background is chosen by first selecting, |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
305 at random, an image from a set of images. Then we choose a $32\times32$ subregion |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
306 of that image as the background image (by sampling x and y positions |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
307 uniformly while making sure not to cross image borders). |
415
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
308 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
309 To combine the original letter image and the background image, contrast |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
310 adjustments are made. We first get the maximal values (i.e. maximal |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
311 intensity) for both the original image and the background image, $maximage$ |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
312 and $maxbg$. We also have a parameter, $contrast$, given by sampling from a |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
313 uniform distribution over $[complexity, 1]$. |
415
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
314 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
315 Once we have all these numbers, we first adjust the values for the |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
316 background image. Each pixel value is multiplied by $\frac{max(maximage - |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
317 contrast, 0)}{maxbg}$. Therefore the higher the contrast, the darker the |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
318 background will be. |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
319 |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
320 The final image is found by taking the brightest (i.e. value nearest to 1) |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
321 pixel from either the background image or the corresponding pixel in the |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
322 original image. |
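The contrast adjustment and the final combination can be sketched as follows, with both images given as lists of rows of floats in $[0,1]$. The function name is hypothetical, and the sketch assumes $maxbg > 0$.

```python
def add_background(image, background, contrast):
    """Combine a letter image with a background patch of the same size."""
    maximage = max(max(row) for row in image)
    maxbg = max(max(row) for row in background)
    # Scaling factor applied to every background pixel (assumes maxbg > 0).
    scale = max(maximage - contrast, 0.0) / maxbg
    # Keep, for each position, the brightest of the letter and scaled background.
    return [
        [max(pixel, bg * scale) for pixel, bg in zip(img_row, bg_row)]
        for img_row, bg_row in zip(image, background)
    ]
```

When $contrast \ge maximage$ the scaling factor is 0, so the background disappears and the original image is returned.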
415
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
323 |
379
a21a174c1c18
added writeup skeleton
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
324 \subsection{Salt and Pepper Noise} |
415
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
325 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
326 This filter adds noise to the image by randomly selecting a certain number |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
327 of pixels and, for those selected pixels, assigning a random value according to |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
328 a uniform distribution over the $[0,1]$ range. This last distribution does |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
329 not change according to complexity. Instead, the number of selected pixels |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
330 does: the proportion of changed pixels corresponds to $complexity / 5$, |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
331 which means, as a maximum, 20\% of the pixels will be randomized. At the |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
332 lowest extreme, no pixel is changed. |
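A minimal sketch of this filter on an image stored as a list of rows. The function name is hypothetical, and rounding the number of changed pixels to the nearest integer is our assumption.

```python
import random

def salt_and_pepper(image, complexity, rng=random):
    """Randomize a proportion complexity / 5 of the pixels of `image`."""
    height, width = len(image), len(image[0])
    # Assumption: the pixel count is rounded to the nearest integer.
    n_changed = int(round(complexity / 5.0 * height * width))
    positions = rng.sample(range(height * width), n_changed)
    noisy = [list(row) for row in image]  # copy, leaving the input untouched
    for index in positions:
        noisy[index // width][index % width] = rng.uniform(0.0, 1.0)
    return noisy, positions
```

The function also returns the selected positions so that the number of randomized pixels can be checked directly.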
415
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
333 |
420
a3a4a9c6476d
added transformations description and began dataset descriptions
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
417
diff
changeset
|
334 This filter also has a 75\% probability of not being applied at all. |
415
1e9788ce1680
Added the parts concerning the transformations I'd announced I'd do: Local elastic deformations; occlusions; gimp transformations; salt and pepper noise; background images
fsavard
parents:
411
diff
changeset
|
335 |
379
a21a174c1c18
added writeup skeleton
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
336 \subsection{Spatially Gaussian Noise} |
426
a7fab59de174
change order of transformations
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
425
diff
changeset
|
337 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
338 The aim of this transformation is to filter, with a Gaussian kernel, |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
339 different regions of the image. In order to save computing time we decided |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
340 to convolve the whole image only once with a symmetric Gaussian kernel of |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
341 size and variance chosen uniformly in the ranges: $[12,12 + 20 \times |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
342 complexity]$ and $[2,2 + 6 \times complexity]$. The result is normalized |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
343 between $0$ and $1$. We also create a symmetric averaging window, of the |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
344 kernel size, with maximum value at the center. For each image we sample |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
345 uniformly from $3$ to $3 + 10 \times complexity$ pixels that will be |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
346 averaging centers between the original image and the filtered one. We |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
347 initialize to zero a mask matrix of the image size. For each selected pixel |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
348 we add to the mask the averaging window centered on it. The final image is |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
349 computed from the following element-wise operation: $\frac{image + filtered\_image |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
350 \times mask}{mask+1}$. |
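The element-wise operation above can be sketched as follows, with the original image, the filtered image and the mask all given as lists of rows of the same size. The function name is hypothetical.

```python
def blend_with_mask(image, filtered, mask):
    """Element-wise (image + filtered * mask) / (mask + 1)."""
    return [
        [(p + f * m) / (m + 1.0) for p, f, m in zip(img_row, fil_row, mask_row)]
        for img_row, fil_row, mask_row in zip(image, filtered, mask)
    ]
```

Where the mask is 0 the original pixel is kept; where it is 1 the result is the plain average of the original and filtered pixels, and larger mask values weight the filtered image more heavily.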
420
a3a4a9c6476d
added transformations description and began dataset descriptions
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
417
diff
changeset
|
351 |
a3a4a9c6476d
added transformations description and began dataset descriptions
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
417
diff
changeset
|
352 This filter has a 75\% probability of not being applied at all. |
a3a4a9c6476d
added transformations description and began dataset descriptions
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
417
diff
changeset
|
353 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
354 \subsection{Scratches} |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
355 |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
356 The scratches module places line-like white patches on the image. The |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
357 lines are in fact heavily transformed images of the digit "1" (one), chosen |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
358 at random among five thousand such start images of this digit. |
428 | 359 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
360 Once the image is selected, the transformation begins by finding the first |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
361 $top$, $bottom$, $right$ and $left$ non-zero pixels in the image. It is |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
362 then cropped to the region thus delimited, then this cropped version is |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
363 expanded to $32\times32$ again. It is then rotated by a random angle having a |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
364 Gaussian distribution of mean 90 and standard deviation $100 \times |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
365 complexity$ (in degrees). The rotation is done with bicubic interpolation. |
428 | 366 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
367 The rotated image is then resized to $50\times50$, with anti-aliasing. In |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
368 that image, we crop the image again by selecting a region delimited |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
369 horizontally to $left$ to $left+32$ and vertically by $top$ to $top+32$. |
428 | 370 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
371 Once this is done, two passes of a greyscale morphological erosion filter |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
372 are applied. Put briefly, this erosion filter reduces the width of the line |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
373 by a certain $smoothing$ amount. For small complexities (< 0.5), |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
374 $smoothing$ is 6, so the line is very small. For complexities ranging from |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
375 0.25 to 0.5, $smoothing$ is 5. It is 4 for complexities 0.5 to 0.75, and 3 |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
376 for higher complexities. |
428 | 377 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
378 To compensate for border effects, the image is then cropped to 28x28 by |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
379 removing two pixels everywhere on the borders, then expanded to 32x32 |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
380 again. The pixel values are then linearly expanded such that the minimum |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
381 value is 0 and the maximal one is 1. Then, 50\% of the time, the image is |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
382 vertically flipped. |
428 | 383 |
462
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
384 This filter is only applied only 15\% of the time. When it is applied, 50\% |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
385 of the time, only one patch image is generated and applied. In 30\% of |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
386 cases, two patches are generated, and otherwise three patches are |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
387 generated. The patch is applied by taking the maximal value on any given |
f59af1648d83
cleaner le techreport
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
461
diff
changeset
|
388 patch or the original image, for each of the 32x32 pixel locations. |
420
a3a4a9c6476d
added transformations description and began dataset descriptions
Xavier Glorot <glorotxa@iro.umontreal.ca>
parents:
417
diff
changeset
|
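
The sampling and compositing rules of the scratches module can be sketched in Python as follows. This is only a minimal sketch under stated assumptions, not the code of the actual pipeline: the function names are hypothetical, images are plain lists of rows, the smoothing thresholds follow the ranges 0 to 0.25, 0.25 to 0.5, 0.5 to 0.75 and above, and the treatment of values lying exactly on a threshold is an assumption.

```python
import random


def smoothing_for(complexity):
    """Erosion amount for a given complexity (boundary handling is assumed)."""
    if complexity < 0.25:
        return 6
    if complexity < 0.5:
        return 5
    if complexity < 0.75:
        return 4
    return 3


def sample_num_patches(rng):
    """Return 0 when the filter is skipped, otherwise 1, 2 or 3 patches."""
    if rng.random() >= 0.15:   # the filter is applied only 15% of the time
        return 0
    u = rng.random()
    if u < 0.5:                # 50% of applications: one patch
        return 1
    if u < 0.8:                # 30% of applications: two patches
        return 2
    return 3                   # remaining 20%: three patches


def apply_patches(image, patches):
    """Pixel-wise maximum of the image and every patch (same dimensions)."""
    out = [row[:] for row in image]
    for patch in patches:
        for i, row in enumerate(patch):
            for j, value in enumerate(row):
                if value > out[i][j]:
                    out[i][j] = value
    return out
```

Taking the pixel-wise maximum means a white scratch can only brighten the character image, never erase bright strokes that are already present.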

\subsection{Color and Contrast Changes}

This filter changes the contrast and may invert the image polarity (white
on black to black on white). The contrast $C$ is defined here as the
difference between the maximum and the minimum pixel value of the image. A
contrast value is sampled uniformly between $1$ and $1-0.85 \times
complexity$ (this ensures a minimum contrast of $0.15$). We then simply
normalize the image to the range $[\frac{1-C}{2},1-\frac{1-C}{2}]$. The
polarity is inverted with probability $0.5$.
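
The contrast change can be sketched in Python as follows. This is a hedged illustration rather than the pipeline's implementation: the function name is hypothetical, the image is a list of rows with values in $[0,1]$, polarity inversion is taken to be $v \mapsto 1-v$, and mapping a constant image to $0.5$ is an assumption made only to avoid a division by zero.

```python
import random


def change_contrast(image, complexity, rng):
    """Rescale pixel values to a range of width C centred on 0.5,
    then invert the polarity half of the time. Returns (image, C)."""
    # C is uniform between 1 - 0.85 * complexity and 1, so C >= 0.15.
    c = rng.uniform(1.0 - 0.85 * complexity, 1.0)
    low = (1.0 - c) / 2.0
    flat = [v for row in image for v in row]
    vmin, vmax = min(flat), max(flat)
    span = vmax - vmin
    out = []
    for row in image:
        if span > 0:
            out.append([low + c * (v - vmin) / span for v in row])
        else:
            out.append([0.5 for _ in row])  # assumption for constant images
    if rng.random() < 0.5:                  # polarity inversion
        out = [[1.0 - v for v in row] for row in out]
    return out, c
```

With a complexity of $0$ the sampled contrast is always $1$ and the image keeps its full range; inversion leaves the range $[\frac{1-C}{2},1-\frac{1-C}{2}]$ unchanged because it is symmetric around $0.5$.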


\begin{figure}[h]
\resizebox{.99\textwidth}{!}{\includegraphics{images/example_t.png}}\\
\caption{Illustration of the pipeline of stochastic
transformations applied to the image of a lower-case t
(the upper left image). Each image in the pipeline (going from
left to right, first top line, then bottom line) shows the result
of applying one of the modules in the pipeline. The last image
(bottom right) is used as a training example.}
\label{fig:pipeline}
\end{figure}


\begin{figure}[h]
\resizebox{.99\textwidth}{!}{\includegraphics{images/transfo.png}}\\
\caption{Illustration of each transformation applied to the same image
of the upper-case h (upper-left image). First row (from left to right): original image, slant,
thickness, affine transformation, local elastic deformation; second row (from left to right):
pinch, motion blur, occlusion, pixel permutation, Gaussian noise; third row (from left to right):
background image, salt and pepper noise, spatially Gaussian noise, scratches,
color and contrast changes.}
\label{fig:transfo}
\end{figure}


\section{Experimental Setup}

\subsection{Training Datasets}

\subsubsection{Data Sources}

\begin{itemize}
\item {\bf NIST}
The NIST Special Database 19 (NIST19) is a very widely used dataset for training and testing OCR systems.
The dataset is composed of over 800,000 digits and characters (upper and lower case), with hand-checked classifications,
extracted from handwritten sample forms of 3600 writers. Each character is labelled with one of the 62 classes
corresponding to ``0''--``9'', ``A''--``Z'' and ``a''--``z''. The dataset contains 8 series of different complexity.
The fourth series, $hsf_4$, experimentally recognized as the most difficult one for classification, is recommended
by NIST as a test set and is used in our work for that purpose.
Most results reported in previous work on this dataset use only the digits.
Here we use all 62 classes in both the training and testing phases.


\item {\bf Fonts}
In order to have a good variety of sources, we downloaded a large number of free fonts from {\tt http://anonymous.url.net}
%real address {\tt http://cg.scs.carleton.ca/~luc/freefonts.html}
in addition to those of Windows 7; this adds up to a total of $9817$ different fonts, among which we choose uniformly.
The ttf file is either used as input to the Captcha generator (see next item) or, by producing a corresponding image,
directly as input to our models.
%Guillaume are there other details I forgot on the font selection?

\item {\bf Captchas}
The Captcha data source is an adaptation of the \emph{pycaptcha} library (a Python-based captcha generator library) for
generating characters of the same format as the NIST dataset. The core of this data source is composed of a random character
generator and various kinds of transformations similar to those described in the previous sections.
In order to increase the variability of the data generated, different fonts are used for generating the characters.
Transformations (slant, distortions, rotation, translation) are applied to each randomly generated character with a complexity
depending on the value of the complexity parameter provided by the user of the data source. Two levels of complexity are
allowed and can be controlled via an easy-to-use facade class.
\item {\bf OCR data}
\end{itemize}

\subsubsection{Data Sets}
\begin{itemize}
\item {\bf P07}
The dataset P07 is sampled with our transformation pipeline with a complexity parameter of $0.7$.
For each new example to generate, we choose one source with the following probabilities: $0.1$ for the fonts,
$0.25$ for the captchas, $0.25$ for OCR data and $0.4$ for NIST. We apply all the transformations in their order
and for each of them we sample uniformly a complexity in the range $[0,0.7]$.
\item {\bf NISTP} {\em do not use PNIST but NISTP, to stay politically correct...}
NISTP is equivalent to P07 (complexity parameter of $0.7$ with the same proportions of sources) except that we only apply transformations from slant to pinch. Therefore, the character is transformed
but no additional noise is added to the image, which gives images closer to the NIST dataset.
\end{itemize}

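The P07 sampling procedure described above (choice of a data source, then one complexity per transformation) can be sketched as follows. This is a minimal Python sketch under assumptions: the source labels, the function names and the list of transformation names passed by the caller are hypothetical and do not come from the actual generation code.

```python
import random

# Source probabilities stated for P07 (and reused for NISTP).
SOURCE_PROBS = [("fonts", 0.1), ("captchas", 0.25), ("ocr", 0.25), ("nist", 0.4)]


def sample_source(rng):
    """Pick the data source of the next example according to SOURCE_PROBS."""
    names = [name for name, _ in SOURCE_PROBS]
    weights = [prob for _, prob in SOURCE_PROBS]
    return rng.choices(names, weights=weights, k=1)[0]


def sample_complexities(transform_names, max_complexity, rng):
    """Draw one complexity per transformation, uniform in [0, max_complexity]."""
    return {name: rng.uniform(0.0, max_complexity) for name in transform_names}
```

For P07, {\tt max\_complexity} would be $0.7$ and the list would contain every transformation in pipeline order; for NISTP only the transformations from slant to pinch would be listed.
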
We noticed that the class distributions of the training sets and the test sets differ.
Since our validation sets are sampled from the training set, they have approximately the same distribution, but the test set has a completely different distribution, as illustrated in Figure~\ref{setsdata}.

\begin{figure}
\subfigure[NIST training]{\includegraphics[width=0.5\textwidth]{images/nisttrainstats}}
\subfigure[NIST validation]{\includegraphics[width=0.5\textwidth]{images/nistvalidstats}}
\subfigure[NIST test]{\includegraphics[width=0.5\textwidth]{images/nistteststats}}
\subfigure[NISTP validation]{\includegraphics[width=0.5\textwidth]{images/nistpvalidstats}}
\caption{Proportion of each class in some of the data sets}
\label{setsdata}
\end{figure}

\subsection{Models and their Hyperparameters}

\subsubsection{Multi-Layer Perceptrons (MLP)}

An MLP is a family of functions that are described by stacking layers of a function similar to
$$g(x) = \tanh(b+Wx)$$
The input, $x$, is a $d$-dimensional vector.
The output, $g(x)$, is an $m$-dimensional vector.
The parameter $W$ is an $m\times d$ matrix and is called the weight matrix.
The parameter $b$ is an $m$-vector and is called the bias vector.
The non-linearity (here $\tanh$) is applied element-wise to the output vector.
Usually the input is referred to as the input layer and similarly for the output.
You can of course chain several such functions to obtain a more complex one.
Here is a common example:
$$f(x) = c + V\tanh(b+Wx)$$
In this case the intermediate layer corresponding to $\tanh(b+Wx)$ is called a hidden layer.
Here the output layer does not have the same non-linearity as the hidden layer.
This is a common case where some specialized non-linearity is applied to the output layer only, depending on the task at hand.

If you put 3 or more hidden layers in such a network you obtain what is called a deep MLP.
The parameters to adapt are the weight matrix and the bias vector for each layer.

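The one-hidden-layer function $f(x) = c + V\tanh(b+Wx)$ can be written out as a short Python sketch. It is only an illustration: the function names are hypothetical, matrices are lists of rows, and no output non-linearity is applied since the text leaves it task-dependent.

```python
import math


def affine(W, b, x):
    """Compute b + W x for a matrix W (list of rows) and vectors b, x."""
    return [bi + sum(wij * xj for wij, xj in zip(row, x))
            for row, bi in zip(W, b)]


def hidden_layer(W, b, x):
    """g(x) = tanh(b + W x), with tanh applied element-wise."""
    return [math.tanh(a) for a in affine(W, b, x)]


def mlp_forward(W, b, V, c, x):
    """f(x) = c + V tanh(b + W x): one hidden layer, linear output."""
    return affine(V, c, hidden_layer(W, b, x))
```

A deeper network is obtained by applying {\tt hidden\_layer} several times, each time with its own weight matrix and bias vector, before the final affine output.
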
\subsubsection{Stacked Denoising Auto-Encoders (SDAE)}
\label{SdA}

Auto-encoders are essentially a way to initialize the weights of the network to enable better generalization.
This is essentially unsupervised training where the layer is made to reconstruct its input through an encoding and a decoding phase.
Denoising auto-encoders are a variant where the input is corrupted with random noise but the target is the uncorrupted input.
The principle behind these initialization methods is that the network will learn the inherent relations between portions of the data and be able to represent them, thus helping with whatever task we want to perform.

An auto-encoder unit is formed of two MLP layers, with the bottom one called the encoding layer and the top one the decoding layer.
Usually the top and bottom weight matrices are constrained to be the transpose of each other.
The network is trained as such and, when sufficiently trained, the MLP layer is initialized with the parameters of the encoding layer.
The other parameters are discarded.

The stacked version is an adaptation to deep MLPs where you initialize each layer with a denoising auto-encoder starting from the bottom.
During the initialization, which is usually called pre-training, the bottom layer is treated as if it were an isolated auto-encoder.
The second and following layers receive the same treatment except that they take as input the encoded version of the data that has gone through the layers before them.
For additional details see \cite{vincent:icml08}.

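In equations, the pre-training of one layer can be summarized as follows. This is a sketch under assumptions: the corruption process $q$, the decoder non-linearity $s$, the decoder bias $b'$ and the reconstruction loss $L$ are placeholders that are not specified in the description above; only the tied weights $W$ and $W^T$ and the uncorrupted target $x$ come from it.
$$\tilde{x} \sim q(\tilde{x} \mid x), \qquad h = \tanh(b + W\tilde{x}), \qquad \hat{x} = s(b' + W^T h)$$
The parameters $W$, $b$ and $b'$ are adapted to make the reconstruction loss $L(x,\hat{x})$ small on the training inputs. After pre-training, $W$ and $b$ initialize the corresponding MLP layer and $b'$ is discarded.
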
\section{Experimental Results}

\subsection{SDA vs MLP vs Humans}

We compare here the best MLP (according to validation set error) that we found against
the best SDA (again according to validation set error), along with a precise estimate
of human performance obtained via Amazon's Mechanical Turk (AMT)
service\footnote{http://mturk.com}. AMT users are paid small amounts
of money to perform tasks for which human intelligence is required.
Mechanical Turk has been used extensively in natural language
processing \cite{SnowEtAl2008} and vision
\cite{SorokinAndForsyth2008,whitehill09}. AMT users were presented
with 10 character images and asked to type 10 corresponding ASCII
characters. Hence they were forced to make a hard choice among the
62 character classes. Three users classified each image, allowing us
to estimate inter-human variability (shown as $\pm$ in parentheses below).

\begin{table}
\caption{Overall comparison of error rates ($\pm$ std.err.) on 62 character classes (10 digits +
26 lower + 26 upper), except for the last column, which covers digits only, between deep architecture with pre-training
(SDA=Stacked Denoising Autoencoder) and ordinary shallow architecture
(MLP=Multi-Layer Perceptron). The models shown are all trained using perturbed data (NISTP or P07)
and using a validation set to select hyper-parameters and other training choices.
\{SDA,MLP\}0 are trained on NIST,
\{SDA,MLP\}1 are trained on NISTP, and \{SDA,MLP\}2 are trained on P07.
The human error rate on digits is a lower bound because it does not count digits that were
recognized as letters. For comparison, the results found in the literature
on NIST digits classification using the same test set are included.}
\label{tab:sda-vs-mlp-vs-humans}
555 \begin{center} |
438 | 556 \begin{tabular}{|l|r|r|r|r|} \hline |
458
c0f738f0cef0
added many results
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
452
diff
changeset
|
557 & NIST test & NISTP test & P07 test & NIST test digits \\ \hline |
c0f738f0cef0
added many results
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
452
diff
changeset
|
558 Humans& 18.2\% $\pm$.1\% & 39.4\%$\pm$.1\% & 46.9\%$\pm$.1\% & $>1.1\%$ \\ \hline |
c0f738f0cef0
added many results
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
452
diff
changeset
|
559 SDA0 & 23.7\% $\pm$.14\% & 65.2\%$\pm$.34\% & 97.45\%$\pm$.06\% & 2.7\% $\pm$.14\%\\ \hline |
c0f738f0cef0
added many results
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
452
diff
changeset
|
560 SDA1 & 17.1\% $\pm$.13\% & 29.7\%$\pm$.3\% & 29.7\%$\pm$.3\% & 1.4\% $\pm$.1\%\\ \hline |
c0f738f0cef0
added many results
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
452
diff
changeset
|
561 SDA2 & 18.7\% $\pm$.13\% & 33.6\%$\pm$.3\% & 39.9\%$\pm$.17\% & 1.7\% $\pm$.1\%\\ \hline |
460
fe292653a0f8
ajoute dernier tableau de resultats
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
458
diff
changeset
|
562 MLP0 & 24.2\% $\pm$.15\% & 68.8\%$\pm$.33\% & 78.70\%$\pm$.14\% & 3.45\% $\pm$.15\% \\ \hline |
458
c0f738f0cef0
added many results
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
452
diff
changeset
|
563 MLP1 & 23.0\% $\pm$.15\% & 41.8\%$\pm$.35\% & 90.4\%$\pm$.1\% & 3.85\% $\pm$.16\% \\ \hline |
460
fe292653a0f8
ajoute dernier tableau de resultats
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
458
diff
changeset
|
564 MLP2 & 24.3\% $\pm$.15\% & 46.0\%$\pm$.35\% & 54.7\%$\pm$.17\% & 4.85\% $\pm$.18\% \\ \hline |
461 | 565 [5] & & & & 4.95\% $\pm$.18\% \\ \hline |
566 [2] & & & & 3.71\% $\pm$.16\% \\ \hline | |
567 [3] & & & & 2.4\% $\pm$.13\% \\ \hline | |
568 [4] & & & & 2.1\% $\pm$.12\% \\ \hline | |
392
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
569 \end{tabular} |
5f8fffd7347f
possible image for illustrating perturbations
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
381
diff
changeset
|
570 \end{center} |
438 | 571 \end{table} |

\subsection{Perturbed Training Data More Helpful for SDA}

\begin{table}
\caption{Relative change in error rates due to the use of perturbed training data,
either using NISTP, for the MLP1/SDA1 models, or using P07, for the MLP2/SDA2 models.
Each entry is a ratio of error rates minus one (e.g., SDA0/SDA1$-$1).
A positive value indicates that training on the perturbed data helped for the
given test set (the first 3 columns are on the 62-class tasks and the last one is
on the clean 10-class digits). The deep learning models clearly benefited more
from perturbed training data, even when testing on clean data, whereas the MLP
trained on perturbed data performed worse on the clean digits and about the same
on the clean characters.}
\label{tab:perturbation-effect}
\begin{center}
\begin{tabular}{|l|r|r|r|r|} \hline
 & NIST test & NISTP test & P07 test & NIST test digits \\ \hline
SDA0/SDA1$-$1 & 38\% & 84\% & 228\% & 93\% \\ \hline
SDA0/SDA2$-$1 & 27\% & 94\% & 144\% & 59\% \\ \hline
MLP0/MLP1$-$1 & 5.2\% & 65\% & $-$13\% & $-$10\% \\ \hline
MLP0/MLP2$-$1 & $-$0.4\% & 49\% & 44\% & $-$29\% \\ \hline
\end{tabular}
\end{center}
\end{table}
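
As a worked example of how the entries of the table above can be read (a sketch, assuming
each entry is the ratio of the two test error rates minus one, as the row labels suggest,
and writing $e_{\mathrm{A}}$ as our own shorthand for the test error of model A), the
NIST test digits column gives
\begin{eqnarray*}
\frac{e_{\mathrm{SDA0}}}{e_{\mathrm{SDA1}}} - 1 &=& \frac{2.7}{1.4} - 1 \approx 0.93, \\
\frac{e_{\mathrm{MLP0}}}{e_{\mathrm{MLP1}}} - 1 &=& \frac{3.45}{3.85} - 1 \approx -0.10,
\end{eqnarray*}
that is, training on NISTP nearly halves the SDA error on clean digits while slightly
increasing the MLP error.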

\subsection{Multi-Task Learning Effects}

As previously seen, the SDA is better able to benefit from the
transformations applied to the data than the MLP. In this experiment we
define three tasks: recognizing digits (knowing that the input is a digit),
recognizing upper case characters (knowing that the input is one), and
recognizing lower case characters (knowing that the input is one). We
consider the digit classification task as the target task and we want to
evaluate whether training with the other tasks can help or hurt, and
whether the effect is different for MLPs versus SDAs. The goal is to find
out if deep learning can benefit more (or less) from multiple related tasks
(i.e., the multi-task setting) compared to a corresponding purely supervised
shallow learner.

We use a single hidden layer MLP with 1000 hidden units, and an SDA
with 3 hidden layers (1000 hidden units per layer), pre-trained and
fine-tuned on NIST.

Our results show that the MLP benefits marginally from the multi-task setting
in the case of digits (5\% relative improvement) but is actually hurt in the case
of characters (respectively 3\% and 4\% worse for lower and upper case characters).
On the other hand, the SDA benefits from the multi-task setting, with relative
error rate improvements of 27\%, 15\% and 13\% respectively for digits,
lower and upper case characters, as shown in Table~\ref{tab:multi-task}.

\begin{table}
\caption{Test error rates and relative change in error rates due to the use of
a multi-task setting, i.e., training on each task in isolation vs training
for all three tasks together, for MLPs vs SDAs. The SDA benefits much
more from the multi-task setting. All experiments use only the
unperturbed NIST data, with validation error used for model selection.
Relative improvement is $1 - (\mbox{single-task error}) / (\mbox{multi-task error})$.}
\label{tab:multi-task}
\begin{center}
\begin{tabular}{|l|r|r|r|} \hline
 & single-task & multi-task & relative \\
 & setting & setting & improvement \\ \hline
MLP-digits & 3.77\% & 3.99\% & 5.6\% \\ \hline
MLP-lower & 17.4\% & 16.8\% & $-$4.1\% \\ \hline
MLP-upper & 7.84\% & 7.54\% & $-$3.6\% \\ \hline
SDA-digits & 2.6\% & 3.56\% & 27\% \\ \hline
SDA-lower & 12.3\% & 14.4\% & 15\% \\ \hline
SDA-upper & 5.93\% & 6.78\% & 13\% \\ \hline
\end{tabular}
\end{center}
\end{table}
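
The last column of the table above can be checked against the formula stated in its
caption (a sketch; $e_{\mathrm{single}}$ and $e_{\mathrm{multi}}$ are our own shorthand
for the entries of the single-task and multi-task columns). For the SDA-digits row,
\[
1 - \frac{e_{\mathrm{single}}}{e_{\mathrm{multi}}} = 1 - \frac{2.6}{3.56} \approx 0.27,
\]
and the same computation gives $1 - 12.3/14.4 \approx 0.15$ for SDA-lower and
$1 - 5.93/6.78 \approx 0.13$ for SDA-upper, matching the tabulated 27\%, 15\% and 13\%.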

\section{Conclusions}

\bibliography{strings,ml,aigaion,specials}
\bibliographystyle{mlapa}

\end{document}