comparison writeup/nips2010_submission.tex @ 467:e0e57270b2af

refs
author Yoshua Bengio <bengioy@iro.umontreal.ca>
date Sat, 29 May 2010 16:50:03 -0400
parents 6205481bf33f
children d02d288257bf
comparison
equal deleted inserted replaced
466:6205481bf33f 467:e0e57270b2af
106 \end{enumerate} 106 \end{enumerate}
107 The experimental results presented here provide positive evidence towards all of these questions. 107 The experimental results presented here provide positive evidence towards all of these questions.
108 108
109 \section{Perturbation and Transformation of Character Images} 109 \section{Perturbation and Transformation of Character Images}
110 110
111 This section describes the different transformations we used to generate data, in their order. 111 This section describes the different transformations we used to stochastically
112 transform source images in order to obtain data. More details can
113 be found in this technical report~\cite{ift6266-tr-anonymous}.
112 The code for these transformations (mostly python) is available at 114 The code for these transformations (mostly python) is available at
113 {\tt http://anonymous.url.net}. All the modules in the pipeline share 115 {\tt http://anonymous.url.net}. All the modules in the pipeline share
114 a global control parameter ($0 \le complexity \le 1$) that allows one to modulate the 116 a global control parameter ($0 \le complexity \le 1$) that allows one to modulate the
115 amount of deformation or noise introduced. 117 amount of deformation or noise introduced.
116 118
117 We can differentiate two important parts in the pipeline. The first one, 119 There are two main parts in the pipeline. The first one,
118 from slant to pinch, performs transformations of the character. The second 120 from slant to pinch below, performs transformations. The second
119 part, from blur to contrast, adds noise to the image. 121 part, from blur to contrast, adds different kinds of noise.
120 122
121 \subsection{Slant} 123 {\large\bf Transformations}\\
122 124 {\bf Slant}\\
123 In order to mimic a slant effect, we simply shift each row of the image 125 We mimic slant by shifting each row of the image
124 proportionally to its height: $shift = round(slant \times height)$. We 126 proportionally to its height: $shift = round(slant \times height)$.
125 round the shift in order to have a discrete displacement. We do not use a
126 filter to smooth the result in order to save computing time and also
127 because later transformations have similar effects.
128
129 The $slant$ coefficient can be negative or positive with equal probability 127 The $slant$ coefficient can be negative or positive with equal probability
130 and its value is randomly sampled according to the complexity level. In 128 and its value is randomly sampled according to the complexity level:
131 our case we take uniformly a number in the range $[0,complexity]$, so the 129 $slant \sim U[0,complexity]$, so the
132 maximum displacement for the lowest or highest pixel line is of 130 maximum displacement for the lowest or highest pixel line is of
133 $round(complexity \times 32)$. 131 $round(complexity \times 32)$.\\
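As a concrete illustration, the slant step above can be sketched in NumPy (the function name, the row-indexing convention and the zero fill are our assumptions, not the paper's actual code):

```python
import numpy as np

def slant_image(image, complexity, rng=np.random.default_rng(0)):
    """Shift each row proportionally to its height (sketch of the slant step).

    slant ~ U[0, complexity] with a random sign; the shift is rounded so the
    displacement is a whole number of pixels, as in the text.
    """
    slant = rng.uniform(0, complexity) * rng.choice([-1, 1])
    out = np.zeros_like(image)
    width = image.shape[1]
    for y in range(image.shape[0]):
        shift = int(round(slant * y))        # discrete per-row displacement
        if shift >= 0:
            out[y, shift:] = image[y, :width - shift]
        else:
            out[y, :shift] = image[y, -shift:]
    return out
```

At $complexity=1$ the bottom row can be shifted by up to $round(1 \times 32) = 32$ pixels, matching the maximum displacement stated above.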
134 132 {\bf Thickness}\\
135 133 Morphological operators of dilation and erosion~\cite{Haralick87,Serra82}
136 \subsection{Thickness} 134 are applied. The neighborhood of each pixel is multiplied
137 135 element-wise with a {\em structuring element} matrix.
138 To change the thickness of the characters we used morphological operators: 136 The pixel value is replaced by the maximum or the minimum of the resulting
139 dilation and erosion~\cite{Haralick87,Serra82}. 137 matrix, respectively for dilation or erosion. Ten different structural elements with
140 138 increasing dimensions (largest is $5\times5$) were used. For each image,
141 The basic idea of such transform is, for each pixel, to multiply in the 139 we randomly sample the operator type (dilation or erosion) with equal probability and one structural
142 element-wise manner its neighbourhood with a matrix called the structuring
143 element. Then for dilation we replace the pixel value by the maximum of 137 matrix, respectively for dilation or erosion. Ten different structural elements with
144 the result, or the minimum for erosion. This will dilate or erode objects
145 in the image and the strength of the transform only depends on the
146 structuring element.
147
148 We used ten different structural elements with increasing dimensions (the
149 biggest is $5\times5$). For each image, we randomly sample the operator
150 type (dilation or erosion) with equal probability and one structural
151 element from a subset of the $n$ smallest structuring elements where $n$ is 140 element from a subset of the $n$ smallest structuring elements where $n$ is
152 $round(10 \times complexity)$ for dilation and $round(6 \times complexity)$ 141 $round(10 \times complexity)$ for dilation and $round(6 \times complexity)$
153 for erosion. A neutral element is always present in the set, if it is 142 for erosion. A neutral element is always present in the set, and if it is
154 chosen the transformation is not applied. Erosion allows only the six 143 chosen no transformation is applied. Erosion allows only the six
155 smallest structural elements because when the character is too thin it may 144 smallest structural elements because when the character is too thin it may
156 erase it completely. 145 be completely erased.\\
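A minimal grey-scale dilation/erosion in plain NumPy, to make the structuring-element operation concrete (a sketch; the actual pipeline presumably relies on an image-processing library):

```python
import numpy as np

def morph(image, selem, dilate=True):
    """Grey-scale morphology: each pixel becomes the max (dilation) or min
    (erosion) of its neighbourhood restricted to the structuring element."""
    sh, sw = selem.shape
    ph, pw = sh // 2, sw // 2
    # Pad with the neutral value so borders do not spuriously grow/shrink.
    padded = np.pad(image, ((ph, ph), (pw, pw)),
                    constant_values=0.0 if dilate else 1.0)
    out = np.empty_like(image)
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            patch = padded[y:y + sh, x:x + sw][selem > 0]
            out[y, x] = patch.max() if dilate else patch.min()
    return out
```

Dilating a single white pixel with a $3\times3$ element thickens it into a $3\times3$ block, while eroding it erases it entirely; this is why the text restricts erosion to the six smallest structuring elements.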
157 146 {\bf Affine Transformations}\\
158 \subsection{Affine Transformations} 147 A $2 \times 3$ affine transform matrix (with
159 148 6 parameters $(a,b,c,d,e,f)$) is sampled according to the $complexity$ level.
160 We generate an affine transform matrix according to the complexity level, 149 Each pixel $(x,y)$ of the output image takes the value of the pixel
161 then we apply it directly to the image. The matrix is of size $2 \times 150 nearest to $(ax+by+c,dx+ey+f)$ in the input image. This
162 3$, so we can represent it by six parameters $(a,b,c,d,e,f)$. Formally, 151 produces scaling, translation, rotation and shearing.
163 for each pixel $(x,y)$ of the output image, we give the value of the pixel 152 The marginal distributions of $(a,b,c,d,e,f)$ have been tuned by hand to
164 nearest to : $(ax+by+c,dx+ey+f)$, in the input image. This allows to
165 produce scaling, translation, rotation and shearing variances.
166
167 The sampling of the parameters $(a,b,c,d,e,f)$ have been tuned by hand to
168 forbid important rotations (not to confuse classes) but to give good 153 forbid important rotations (not to confuse classes) but to give good
169 variability of the transformation. For each image we sample uniformly the 154 variability of the transformation: $a$ and $d$ $\sim U[1-3 \times
170 parameters in the following ranges: $a$ and $d$ in $[1-3 \times 155 complexity,1+3 \times complexity]$, $b$ and $e$ $\sim U[-3 \times complexity,3
171 complexity,1+3 \times complexity]$, $b$ and $e$ in $[-3 \times complexity,3 156 \times complexity]$ and $c$ and $f$ $\sim U[-4 \times complexity, 4 \times
172 \times complexity]$ and $c$ and $f$ in $[-4 \times complexity, 4 \times 157 complexity]$.\\
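The affine sampling just described can be sketched with nearest-neighbour lookup (the coordinate ordering, chosen so that $a$ and $d$ act as the diagonal scales and $complexity=0$ gives the identity, is our assumption):

```python
import numpy as np

def affine_image(image, complexity, rng=np.random.default_rng(0)):
    """Sample (a,b,c,d,e,f) from the ranges in the text; each output pixel
    takes the value of the nearest source pixel under the affine map."""
    a, d = rng.uniform(1 - 3 * complexity, 1 + 3 * complexity, 2)
    b, e = rng.uniform(-3 * complexity, 3 * complexity, 2)
    c, f = rng.uniform(-4 * complexity, 4 * complexity, 2)
    h, w = image.shape
    out = np.zeros_like(image)
    for y in range(h):
        for x in range(w):
            sx = int(round(a * x + b * y + c))   # horizontal source coordinate
            sy = int(round(d * y + e * x + f))   # vertical source coordinate
            if 0 <= sx < w and 0 <= sy < h:
                out[y, x] = image[sy, sx]
    return out
```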
173 complexity]$. 158 {\bf Local Elastic Deformations}\\
174 159 This filter induces a "wiggly" effect in the image, following~\cite{SimardSP03},
175 160 which provides more details.
176 \subsection{Local Elastic Deformations} 161 Two "displacements" fields are generated and applied, for horizontal
177 162 and vertical displacements of pixels.
178 This filter induces a "wiggly" effect in the image. The description here
179 will be brief, as the algorithm follows precisely what is described in
180 \cite{SimardSP03}.
181
182 The general idea is to generate two "displacements" fields, for horizontal
183 and vertical displacements of pixels. Each of these fields has the same
184 size as the original image.
185
186 When generating the transformed image, we'll loop over the x and y
187 positions in the fields and select, as a value, the value of the pixel in
188 the original image at the (relative) position given by the displacement
189 fields for this x and y. If the position we'd retrieve is outside the
190 borders of the image, we use a 0 value instead.
191
192 To generate a pixel in either field, first a value between -1 and 1 is 163 To generate a pixel in either field, first a value between -1 and 1 is
193 chosen from a uniform distribution. Then all the pixels, in both fields, is 164 chosen from a uniform distribution. Then all the pixels, in both fields, are
194 multiplied by a constant $\alpha$ which controls the intensity of the 165 multiplied by a constant $\alpha$ which controls the intensity of the
195 displacements (bigger $\alpha$ translates into larger wiggles). 166 displacements (larger $\alpha$ translates into larger wiggles).
196 167 Each field is convoluted with a Gaussian 2D kernel of
197 As a final step, each field is convoluted with a Gaussian 2D kernel of 168 standard deviation $\sigma$. Visually, this results in a blur.
198 standard deviation $\sigma$. Visually, this results in a "blur"
199 filter. This has the effect of making values next to each other in the
200 displacement fields similar. In effect, this makes the wiggles more
201 coherent, less noisy.
202
203 As displacement fields were long to compute, 50 pairs of fields were
204 generated per complexity in increments of 0.1 (50 pairs for 0.1, 50 pairs
205 for 0.2, etc.), and afterwards, given a complexity, we selected randomly
206 among the 50 corresponding pairs.
207
208 $\sigma$ and $\alpha$ were linked to complexity through the formulas
209 $\alpha = \sqrt[3]{complexity} \times 10.0$ and $\sigma = 10 - 7 \times 169 $\alpha = \sqrt[3]{complexity} \times 10.0$ and $\sigma = 10 - 7 \times
210 \sqrt[3]{complexity}$. 170 \sqrt[3]{complexity}$.\\
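A self-contained sketch of the elastic deformation (separable Gaussian smoothing implemented with `np.convolve`; the kernel-radius cap is our addition to keep the convolution well-defined on 32$\times$32 fields):

```python
import numpy as np

def elastic_distort(image, complexity, rng=np.random.default_rng(0)):
    """Random displacement fields scaled by alpha and blurred with a
    Gaussian of std sigma (formulas taken from the text)."""
    alpha = complexity ** (1 / 3) * 10.0
    sigma = 10 - 7 * complexity ** (1 / 3)
    h, w = image.shape
    rad = min(int(3 * sigma), h // 2 - 1)
    r = np.arange(-rad, rad + 1)
    k = np.exp(-r ** 2 / (2 * sigma ** 2))
    k /= k.sum()

    def smooth(field):                        # separable 2-D Gaussian blur
        field = np.apply_along_axis(lambda m: np.convolve(m, k, 'same'), 1, field)
        return np.apply_along_axis(lambda m: np.convolve(m, k, 'same'), 0, field)

    dx = smooth(rng.uniform(-1, 1, (h, w)) * alpha)
    dy = smooth(rng.uniform(-1, 1, (h, w)) * alpha)
    out = np.zeros_like(image)
    for y in range(h):
        for x in range(w):
            sx, sy = int(round(x + dx[y, x])), int(round(y + dy[y, x]))
            out[y, x] = image[sy, sx] if 0 <= sx < w and 0 <= sy < h else 0.0
    return out
```

Out-of-border lookups fall back to 0, and the blur makes neighbouring displacements similar, producing coherent wiggles rather than per-pixel noise.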
211 171 {\bf Pinch}\\
212 172 This GIMP filter is named "Whirl and
213 \subsection{Pinch} 173 pinch", but whirl was set to 0. A pinch is ``similar to projecting the image onto an elastic
214 174 surface and pressing or pulling on the center of the surface''~\cite{GIMP-manual}.
215 This is another GIMP filter we used. The filter is in fact named "Whirl and 175 For a square input image, think of drawing a circle of
216 pinch", but we don't use the "whirl" part (whirl is set to 0). As described
217 in GIMP, a pinch is "similar to projecting the image onto an elastic
218 surface and pressing or pulling on the center of the surface".
219
220 Mathematically, for a square input image, think of drawing a circle of
221 radius $r$ around a center point $C$. Any point (pixel) $P$ belonging to 176 radius $r$ around a center point $C$. Any point (pixel) $P$ belonging to
222 that disk (region inside circle) will have its value recalculated by taking 177 that disk (region inside circle) will have its value recalculated by taking
223 the value of another "source" pixel in the original image. The position of 178 the value of another "source" pixel in the original image. The position of
224 that source pixel is found on the line that goes through $C$ and $P$, but 179 that source pixel is found on the line that goes through $C$ and $P$, but
225 at some other distance $d_2$. Define $d_1$ to be the distance between $P$ 180 at some other distance $d_2$. Define $d_1$ to be the distance between $P$
226 and $C$. $d_2$ is given by $d_2 = \sin(\frac{\pi{}d_1}{2r})^{-pinch} \times 181 and $C$. $d_2$ is given by $d_2 = \sin(\frac{\pi{}d_1}{2r})^{-pinch} \times
227 d_1$, where $pinch$ is a parameter to the filter. 182 d_1$, where $pinch$ is a parameter to the filter.
228
229 If the region considered is not square then, before computing $d_2$, the
230 smallest dimension (x or y) is stretched such that we may consider the
231 region as if it was square. Then, after $d_2$ has been computed and
232 corresponding components $d_2\_x$ and $d_2\_y$ have been found, the
233 component corresponding to the stretched dimension is compressed back by an
234 inverse ratio.
235
236 The actual value is given by bilinear interpolation considering the pixels 183 The actual value is given by bilinear interpolation considering the pixels
237 around the (non-integer) source position thus found. 184 around the (non-integer) source position thus found.
238 185 Here $pinch \sim U[-complexity, 0.7 \times complexity]$.\\
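The pinch mapping can be sketched as follows for a square image (nearest-neighbour lookup is used for brevity instead of the bilinear interpolation described above; names are ours):

```python
import numpy as np

def pinch(image, pinch_amt):
    """Whirl-and-pinch with whirl = 0: the source pixel lies at distance
    d2 = sin(pi*d1/(2r))**(-pinch) * d1 on the line through the centre."""
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r = min(cy, cx)
    out = image.copy()                  # pixels outside the disk are untouched
    for y in range(h):
        for x in range(w):
            d1 = np.hypot(y - cy, x - cx)
            if 0 < d1 <= r:
                d2 = np.sin(np.pi * d1 / (2 * r)) ** (-pinch_amt) * d1
                sy = int(round(cy + (y - cy) * d2 / d1))
                sx = int(round(cx + (x - cx) * d2 / d1))
                if 0 <= sy < h and 0 <= sx < w:
                    out[y, x] = image[sy, sx]
    return out
```

With $pinch = 0$ the exponent vanishes, $d_2 = d_1$, and the filter is the identity; negative values push pixels outward, positive values pull them toward the centre.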
239 The value for $pinch$ in our case was given by sampling from an uniform 186
240 distribution over the range $[-complexity, 0.7 \times complexity]$. 187 {\large\bf Injecting Noise}\\
241 188 {\bf Motion Blur}\\
242 \subsection{Motion Blur} 189 This GIMP filter is a ``linear motion blur'' in GIMP
243 190 terminology, with two parameters, $length$ and $angle$. The value of
244 This is a GIMP filter we applied, a "linear motion blur" in GIMP 191 a pixel in the final image is approximately the mean value of the $length$ first pixels
245 terminology. The description will be brief as it is a well-known filter. 192 found by moving in the $angle$ direction.
246 193 Here $angle \sim U[0,360]$ degrees, and $length \sim {\rm Normal}(0,(3 \times complexity)^2)$.\\
247 This algorithm has two input parameters, $length$ and $angle$. The value of 194 {\bf Occlusion}\\
248 a pixel in the final image is the mean value of the $length$ first pixels 195 This filter selects a random rectangle from an {\em occluder} character
249 found by moving in the $angle$ direction. An approximation of this idea is 196 image and places it over the original {\em occluded} character
250 used, as we won't fall onto precise pixels by following that 197 image. Pixels are combined by taking $\max(occluder,occluded)$, i.e.
251 direction. This is done using the Bresenham line algorithm. 198 whichever value is lighter (the background is 0, black). The rectangle corners
252 199 are sampled so that larger complexity gives larger rectangles.
253 The angle, in our case, is chosen from a uniform distribution over 200 The destination position in the occluded image is also sampled
254 $[0,360]$ degrees. The length, though, depends on the complexity; it's 201 according to a normal distribution (see more details in~\cite{ift6266-tr-anonymous}).
255 sampled from a Gaussian distribution of mean 0 and standard deviation 202 It has a probability of not being applied at all of 60\%.\\
256 $\sigma = 3 \times complexity$. 203 {\bf Pixel Permutation}\\
257 204 This filter permutes neighbouring pixels. It selects first
258 \subsection{Occlusion}
259
260 This filter selects random parts of other (hereafter "occlusive") letter
261 images and places them over the original letter (hereafter "occluded")
262 image. To be more precise, having selected a subregion of the occlusive
263 image and a desination position in the occluded image, to determine the
264 final value for a given overlapping pixel, it selects whichever pixel is
265 the lightest. As a reminder, the background value is 0, black, so the value
266 nearest to 1 is selected.
267
268 To select a subpart of the occlusive image, four numbers are generated. For
269 compatibility with the code, we'll call them "haut", "bas", "gauche" and
270 "droite" (respectively meaning top, bottom, left and right). Each of these
271 numbers is selected according to a Gaussian distribution of mean $8 \times
272 complexity$ and standard deviation $2$. This means the larger the
273 complexity, the bigger the occlusion will be. The absolute value is
274 taken, as the numbers must be positive, and the maximum value is capped at
275 15.
276
277 These four sizes collectively define a window centered on the middle pixel
278 of the occlusive image. This is the part that will be extracted as the
279 occlusion.
280
281 The next step is to select a destination position in the occluded
282 image. Vertical and horizontal displacements $y\_arrivee$ and $x\_arrivee$
283 are selected according to Gaussian distributions of mean 0 and of standard
284 deviations of, respectively, 3 and 2. Then a horizontal placement mode,
285 $place$, is selected among three values meaning
286 left, middle or right.
287
288 If $place$ is "middle", the occlusion will be horizontally centered
289 around the horizontal middle of the occluded image, then shifted according
290 to $x\_arrivee$. If $place$ is "left", it will be placed on the left of
291 the occluded image, then displaced right according to $x\_arrivee$. The
292 contrary happens if $place$ is $right$.
293
294 In both the horizontal and vertical positioning, the maximum position in
295 either direction is such that the selected occlusion won't go beyond the
296 borders of the occluded image.
297
298 This filter has a probability of not being applied, at all, of 60\%.
299
300
301 \subsection{Pixel Permutation}
302
303 This filter permutes neighbouring pixels. It selects first
304 $\frac{complexity}{3}$ pixels randomly in the image. Each of them is then 205 $\frac{complexity}{3}$ pixels randomly in the image. Each of them is then
305 sequentially exchanged with another pixel in its $V4$ neighbourhood. The numbers 206 sequentially exchanged with another pixel in its $V4$ neighbourhood. The numbers
306 of exchanges to the left, right, top and bottom are equal, or differ by at 207 of exchanges to the left, right, top and bottom are equal, or differ by at
307 most 1 if the number of selected pixels is not a multiple of 4. 208 most 1 if the number of selected pixels is not a multiple of 4.
308 209 It has a probability of not being applied at all of 80\%.\\
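The permutation step can be sketched as follows (we read "$\frac{complexity}{3}$ pixels" as a fraction of the image, which is our assumption):

```python
import numpy as np

def permute_pixels(image, complexity, rng=np.random.default_rng(0)):
    """Swap selected pixels with a 4-neighbour; cycling through the four
    directions keeps the per-direction counts within 1 of each other."""
    h, w = image.shape
    out = image.copy()
    n = int(h * w * complexity / 3)       # fraction of pixels (our reading)
    ys, xs = rng.integers(0, h, n), rng.integers(0, w, n)
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    for i in range(n):
        dy, dx = offsets[i % 4]
        y, x, ny, nx = ys[i], xs[i], ys[i] + dy, xs[i] + dx
        if 0 <= ny < h and 0 <= nx < w:
            out[y, x], out[ny, nx] = out[ny, nx], out[y, x]
    return out
```

Because every operation is a swap, the multiset of pixel values is preserved; only their positions change.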
309 It has a probability of not being applied, at all, of 80\%. 210 {\bf Gaussian Noise}\\
310
311
312 \subsection{Gaussian Noise}
313
314 This filter simply adds, to each pixel of the image independently, a 211 This filter simply adds, to each pixel of the image independently, a
315 Gaussian noise of mean $0$ and standard deviation $\frac{complexity}{10}$. 212 noise $\sim Normal(0,(\frac{complexity}{10})^2)$.
316 213 It has a probability of not being applied at all of 70\%.\\
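This step is a one-liner in NumPy (clipping back to $[0,1]$ is our addition, not stated in the text):

```python
import numpy as np

def add_gaussian_noise(image, complexity, rng=np.random.default_rng(0)):
    """Add i.i.d. Gaussian noise with std complexity/10; skipped 70% of
    the time, as stated in the text. Clipping to [0,1] is an assumption."""
    if rng.uniform() < 0.7:
        return image.copy()
    return np.clip(image + rng.normal(0, complexity / 10, image.shape), 0, 1)
```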
317 It has a probability of not being applied, at all, of 70\%. 214 {\bf Background Images}\\
318
319
320 \subsection{Background Images}
321
322 Following~\cite{Larochelle-jmlr-2009}, this transformation adds a random 215 Following~\cite{Larochelle-jmlr-2009}, this transformation adds a random
323 background behind the letter. The background is chosen by first selecting, 216 background behind the letter. The background is chosen by first selecting,
324 at random, an image from a set of images. Then we choose a 32x32 subregion 217 at random, an image from a set of images. Then a 32$\times$32 subregion
325 of that image as the background image (by sampling x and y positions 218 of that image is chosen as the background image (by sampling position
326 uniformly while making sure not to cross image borders). 219 uniformly while making sure not to cross image borders).
327
328 To combine the original letter image and the background image, contrast 220 To combine the original letter image and the background image, contrast
329 adjustments are made. We first get the maximal values (i.e. maximal 221 adjustments are made. We first get the maximal values (i.e. maximal
330 intensity) for both the original image and the background image, $maximage$ 222 intensity) for both the original image and the background image, $maximage$
331 and $maxbg$. We also have a parameter, $contrast$, given by sampling from a 223 and $maxbg$. We also have a parameter $contrast \sim U[complexity, 1]$.
332 uniform distribution over $[complexity, 1]$. 224 Each background pixel value is multiplied by $\frac{max(maximage -
333 225 contrast, 0)}{maxbg}$ (higher contrast yields darker
334 Once we have all these numbers, we first adjust the values for the 226 background). The output image pixels are max(background,original).\\
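The combination rule can be sketched as follows (`bg` is an assumed 32$\times$32 patch already cropped from a larger image; the guard against a zero `maxbg` is ours):

```python
import numpy as np

def add_background(image, bg, complexity, rng=np.random.default_rng(0)):
    """Rescale a background patch by max(maximage - contrast, 0)/maxbg and
    keep the lighter pixel at each location, per the text."""
    contrast = rng.uniform(complexity, 1.0)
    maxbg = bg.max()
    scale = max(image.max() - contrast, 0.0) / maxbg if maxbg > 0 else 0.0
    return np.maximum(bg * scale, image)
```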
335 background image. Each pixel value is multiplied by $\frac{max(maximage - 227 {\bf Salt and Pepper Noise}\\
336 contrast, 0)}{maxbg}$. Therefore the higher the contrast, the darker the 228 This filter adds noise $\sim U[0,1]$ to random subsets of pixels.
337 background will be. 229 The proportion of selected pixels is $0.2 \times complexity$.
338 230 This filter has a probability of not being applied at all of 75\%.\\
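A sketch of the salt-and-pepper step (function name is ours):

```python
import numpy as np

def salt_and_pepper(image, complexity, rng=np.random.default_rng(0)):
    """Replace a 0.2*complexity fraction of pixels with U[0,1] values,
    as described in the text (at most 20% of the pixels at complexity 1)."""
    out = image.copy().ravel()
    n = int(out.size * 0.2 * complexity)
    idx = rng.choice(out.size, size=n, replace=False)
    out[idx] = rng.uniform(0, 1, n)
    return out.reshape(image.shape)
```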
339 The final image is found by taking the brightest (i.e. value nearest to 1) 231 {\bf Spatially Gaussian Noise}\\
340 pixel from either the background image or the corresponding pixel in the 232 Different regions of the image are spatially smoothed.
341 original image. 233 The image is convolved with a symmetric Gaussian kernel of
342 234 size and variance chosen uniformly in the ranges $[12,12 + 20 \times
343 \subsection{Salt and Pepper Noise}
344
345 This filter adds noise to the image by randomly selecting a certain number
346 of pixels and, for each selected pixel, assigning a random value according to
347 a uniform distribution over the $[0,1]$ range. This last distribution does
348 not change according to complexity. Instead, the number of selected pixels
349 does: the proportion of changed pixels corresponds to $complexity / 5$,
350 which means, as a maximum, 20\% of the pixels will be randomized. On the
351 lowest extreme, no pixel is changed.
352
353 This filter also has a probability of not being applied, at all, of 75\%.
354
355 \subsection{Spatially Gaussian Noise}
356
357 The aim of this transformation is to filter, with a Gaussian kernel,
358 different regions of the image. In order to save computing time we decided
359 to convolve the whole image only once with a symmetric Gaussian kernel of
360 size and variance chosen uniformly in the ranges: $[12,12 + 20 \times
361 complexity]$ and $[2,2 + 6 \times complexity]$. The result is normalized 235 complexity]$ and $[2,2 + 6 \times complexity]$. The result is normalized
362 between $0$ and $1$. We also create a symmetric averaging window, of the 236 between $0$ and $1$. We also create a symmetric averaging window, of the
363 kernel size, with maximum value at the center. For each image we sample 237 kernel size, with maximum value at the center. For each image we sample
364 uniformly from $3$ to $3 + 10 \times complexity$ pixels that will be 238 uniformly from $3$ to $3 + 10 \times complexity$ pixels that will be
365 averaging centers between the original image and the filtered one. We 239 averaging centers between the original image and the filtered one. We
366 initialize to zero a mask matrix of the image size. For each selected pixel 240 initialize to zero a mask matrix of the image size. For each selected pixel
367 we add to the mask the averaging window centered to it. The final image is 241 we add to the mask the averaging window centered to it. The final image is
368 computed from the following element-wise operation: $\frac{image + filtered 242 computed from the following element-wise operation: $\frac{image + filtered
369 image \times mask}{mask+1}$. 243 image \times mask}{mask+1}$.
370 244 This filter has a probability of not being applied at all of 75\%.\\
371 This filter has a probability of not being applied, at all, of 75\%. 245 {\bf Scratches}\\
372
373 \subsection{Scratches}
374
375 The scratches module places line-like white patches on the image. The 246 The scratches module places line-like white patches on the image. The
376 lines are in fact heavily transformed images of the digit "1" (one), chosen 247 lines are heavily transformed images of the digit "1" (one), chosen
377 at random among five thousand such start images of this digit. 248 at random among five thousand such images of this digit. The chosen image is
378 249 randomly cropped and rotated by an angle $\sim Normal(0,(100 \times
379 Once the image is selected, the transformation begins by finding the first 250 complexity)^2)$, using bicubic interpolation.
380 $top$, $bottom$, $right$ and $left$ non-zero pixels in the image. It is 251 Two passes of a greyscale morphological erosion filter
381 then cropped to the region thus delimited, then this cropped version is 252 are applied, reducing the width of the line
382 expanded to $32\times32$ again. It is then rotated by a random angle having a 253 by an amount controlled by $complexity$.
383 Gaussian distribution of mean 90 and standard deviation $100 \times
384 complexity$ (in degrees). The rotation is done with bicubic interpolation.
385
386 The rotated image is then resized to $50\times50$, with anti-aliasing. In
387 that image, we crop the image again by selecting a region delimited
388 horizontally to $left$ to $left+32$ and vertically by $top$ to $top+32$.
389
390 Once this is done, two passes of a greyscale morphological erosion filter
391 are applied. Put briefly, this erosion filter reduces the width of the line
392 by a certain $smoothing$ amount. For small complexities (< 0.5),
393 $smoothing$ is 6, so the line is very small. For complexities ranging from
394 0.25 to 0.5, $smoothing$ is 5. It is 4 for complexities 0.5 to 0.75, and 3
395 for higher complexities.
396
397 To compensate for border effects, the image is then cropped to 28x28 by
398 removing two pixels everywhere on the borders, then expanded to 32x32
399 again. The pixel values are then linearly expanded such that the minimum
400 value is 0 and the maximal one is 1. Then, 50\% of the time, the image is
401 vertically flipped.
402
403 This filter is applied only 15\% of the time. When it is applied, 50\% 254 This filter is applied only 15\% of the time. When it is applied, 50\%
404 of the time, only one patch image is generated and applied. In 30\% of 255 of the time, only one patch image is generated and applied. In 30\% of
405 cases, two patches are generated, and otherwise three patches are 256 cases, two patches are generated, and otherwise three patches are
406 generated. The patch is applied by taking the maximal value on any given 257 generated. The patch is applied by taking the maximal value on any given
407 patch or the original image, for each of the 32x32 pixel locations. 258 patch or the original image, for each of the $32 \times 32$ pixel locations.\\
408 259 {\bf Color and Contrast Changes}\\
409 \subsection{Color and Contrast Changes}
410
411 This filter changes the contrast and may invert the image polarity (white 260 This filter changes the contrast and may invert the image polarity (white
412 on black to black on white). The contrast $C$ is defined here as the 261 on black to black on white). The contrast $C$ is defined here as the
413 difference between the maximum and the minimum pixel value of the image. A 262 difference between the maximum and the minimum pixel value of the image.
414 contrast value is sampled uniformly between $1$ and $1-0.85 \times 263 Contrast $C \sim U[1-0.85 \times complexity,1]$ (so contrast $\geq 0.15$).
415 complexity$ (this insures a minimum contrast of $0.15$). We then simply 264 The image is normalized into $[\frac{1-C}{2},1-\frac{1-C}{2}]$. The
416 normalize the image to the range $[\frac{1-C}{2},1-\frac{1-C}{2}]$. The
417 polarity is inverted with $0.5$ probability. 265 polarity is inverted with $0.5$ probability.
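The contrast step above can be sketched as (the guard against a constant image is our addition):

```python
import numpy as np

def change_contrast(image, complexity, rng=np.random.default_rng(0)):
    """Sample C ~ U[1 - 0.85*complexity, 1], normalize pixel values into
    [(1-C)/2, 1-(1-C)/2], and invert polarity with probability 0.5."""
    C = rng.uniform(1 - 0.85 * complexity, 1.0)
    lo, hi = (1 - C) / 2.0, 1 - (1 - C) / 2.0
    span = image.max() - image.min()
    out = (image - image.min()) / (span if span > 0 else 1.0) * (hi - lo) + lo
    return 1 - out if rng.uniform() < 0.5 else out
```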
418 266
419 267
420 \begin{figure}[h] 268 \begin{figure}[h]
421 \resizebox{.99\textwidth}{!}{\includegraphics{images/example_t.png}}\\ 269 \resizebox{.99\textwidth}{!}{\includegraphics{images/example_t.png}}\\
558 SDA1 & 17.1\% $\pm$.13\% & 29.7\%$\pm$.3\% & 29.7\%$\pm$.3\% & 1.4\% $\pm$.1\%\\ \hline 406 SDA1 & 17.1\% $\pm$.13\% & 29.7\%$\pm$.3\% & 29.7\%$\pm$.3\% & 1.4\% $\pm$.1\%\\ \hline
559 SDA2 & 18.7\% $\pm$.13\% & 33.6\%$\pm$.3\% & 39.9\%$\pm$.17\% & 1.7\% $\pm$.1\%\\ \hline 407 SDA2 & 18.7\% $\pm$.13\% & 33.6\%$\pm$.3\% & 39.9\%$\pm$.17\% & 1.7\% $\pm$.1\%\\ \hline
560 MLP0 & 24.2\% $\pm$.15\% & 68.8\%$\pm$.33\% & 78.70\%$\pm$.14\% & 3.45\% $\pm$.15\% \\ \hline 408 MLP0 & 24.2\% $\pm$.15\% & 68.8\%$\pm$.33\% & 78.70\%$\pm$.14\% & 3.45\% $\pm$.15\% \\ \hline
561 MLP1 & 23.0\% $\pm$.15\% & 41.8\%$\pm$.35\% & 90.4\%$\pm$.1\% & 3.85\% $\pm$.16\% \\ \hline 409 MLP1 & 23.0\% $\pm$.15\% & 41.8\%$\pm$.35\% & 90.4\%$\pm$.1\% & 3.85\% $\pm$.16\% \\ \hline
562 MLP2 & 24.3\% $\pm$.15\% & 46.0\%$\pm$.35\% & 54.7\%$\pm$.17\% & 4.85\% $\pm$.18\% \\ \hline 410 MLP2 & 24.3\% $\pm$.15\% & 46.0\%$\pm$.35\% & 54.7\%$\pm$.17\% & 4.85\% $\pm$.18\% \\ \hline
563 [5] & & & & 4.95\% $\pm$.18\% \\ \hline 411 \cite{Granger+al-2007} & & & & 4.95\% $\pm$.18\% \\ \hline
564 [2] & & & & 3.71\% $\pm$.16\% \\ \hline 412 \cite{Cortes+al-2000} & & & & 3.71\% $\pm$.16\% \\ \hline
565 [3] & & & & 2.4\% $\pm$.13\% \\ \hline 413 \cite{Oliveira+al-2002} & & & & 2.4\% $\pm$.13\% \\ \hline
566 [4] & & & & 2.1\% $\pm$.12\% \\ \hline 414 \cite{Migram+al-2005} & & & & 2.1\% $\pm$.12\% \\ \hline
567 \end{tabular} 415 \end{tabular}
568 \end{center} 416 \end{center}
569 \end{table} 417 \end{table}
570 418
571 \subsection{Perturbed Training Data More Helpful for SDAE} 419 \subsection{Perturbed Training Data More Helpful for SDAE}