comparison writeup/techreport.tex @ 438:a6d339033d03

added AMT
author Yoshua Bengio <bengioy@iro.umontreal.ca>
date Mon, 03 May 2010 07:46:18 -0400
parents 479f2f518fc9
children 89258bb41e4c
with this pipeline, using the hundreds of millions of generated examples
and testing on the full NIST test set.
We find that the SDA outperforms its
shallow counterpart, an ordinary Multi-Layer Perceptron,
and that it is better able to take advantage of the additional
generated data, as well as of training with more classes than those
ultimately of interest.
In fact, we find that the SDA reaches human-level performance,
as estimated via Amazon Mechanical Turk, on the NIST test characters.
\end{abstract}

\section{Introduction}

Deep Learning has emerged as a promising new area of research in

The second and subsequent layers receive the same treatment, except that each of them takes as input the encoded version of the data produced by the layers below it.
For additional details see \cite{vincent:icml08}.
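Schematically (using generic notation for this sketch rather than the symbols introduced earlier in the paper), writing $h^{(0)}=x$ for the raw input, the $k$-th layer is pre-trained as a denoising auto-encoder on the representation produced by the already-trained layers below it:
\begin{align*}
\tilde{h}^{(k-1)} &\sim q(\tilde{h}^{(k-1)} \mid h^{(k-1)}) && \mbox{(stochastic corruption)}\\
h^{(k)} &= s(W^{(k)} \tilde{h}^{(k-1)} + b^{(k)}) && \mbox{(encoder)}\\
\hat{h}^{(k-1)} &= s(W'^{(k)} h^{(k)} + c^{(k)}) && \mbox{(decoder)}
\end{align*}
where the parameters of layer $k$ are trained to make $\hat{h}^{(k-1)}$ close to the uncorrupted $h^{(k-1)}$ (e.g., via a reconstruction cross-entropy); once layer $k$ is trained, the clean encoding $h^{(k)} = s(W^{(k)} h^{(k-1)} + b^{(k)})$ serves as input for pre-training layer $k+1$.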

\section{Experimental Results}

\subsection{SDA vs MLP vs Humans}

We compare the best MLP and the best SDA (both selected according to
validation set error) against an estimate of human performance obtained
via Amazon's Mechanical Turk (AMT)
service\footnote{http://mturk.com}. AMT users are paid small amounts
of money to perform tasks for which human intelligence is required.
Mechanical Turk has been used extensively in natural language
processing \cite{SnowEtAl2008} and vision
\cite{SorokinAndForsyth2008,whitehill09}. AMT users were presented
with 10 character images and asked to type the 10 corresponding ASCII
characters, which forced them to make a hard choice among the
62 character classes. Three users classified each image, allowing us
to estimate inter-human variability (shown as $\pm$ in parentheses below).
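The $\pm$ figures can be obtained, for example, as follows (a sketch in our own notation, with $K$, $e_j$ and $\bar{e}$ introduced here for illustration and not necessarily the exact estimator used for the table):
\[
\bar{e} = \frac{1}{K} \sum_{j=1}^{K} e_j , \qquad
\mbox{reported as } \bar{e} \pm \max_{j} \left| e_j - \bar{e} \right| ,
\]
where $K=3$ and $e_j$ denotes the error rate of the $j$-th labeler on the test set; the standard deviation of the $e_j$ would be an equally reasonable choice for the variability term.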

\begin{table}
\caption{Overall comparison of error rates on the 62 character classes (10 digits +
26 lower-case + 26 upper-case letters), except for the last column which is on digits only,
between the deep architecture with pre-training (SDA = Stacked Denoising Autoencoder),
the ordinary shallow architecture (MLP = Multi-Layer Perceptron), and human
performance estimated via AMT.}
\label{tab:sda-vs-mlp-vs-humans}
\begin{center}
\begin{tabular}{|l|r|r|r|r|} \hline
       & NIST test & NISTP test & P07 test & NIST test digits \\ \hline
Humans &           &            &          &                  \\ \hline
SDA    &           &            &          &                  \\ \hline
MLP    &           &            &          &                  \\ \hline
\end{tabular}
\end{center}
\end{table}

\subsection{Perturbed Training Data More Helpful for SDA}

\subsection{Training with More Classes than Necessary}
