% ift6266 -- writeup/contributions.tex @ 586:f5a198b2854a
% author: Yoshua Bengio <bengioy@iro.umontreal.ca>
% date:   Thu, 30 Sep 2010 17:43:48 -0400
\documentclass{article} % For LaTeX2e
\usepackage{times}
\usepackage{wrapfig}
\usepackage{amsthm,amsmath,bbm}
\usepackage[psamsfonts]{amssymb}
\usepackage{algorithm,algorithmic}
\usepackage[utf8]{inputenc}
\usepackage{graphicx,subfigure}
\usepackage[numbers]{natbib}

\addtolength{\textwidth}{10mm}
\addtolength{\evensidemargin}{-5mm}
\addtolength{\oddsidemargin}{-5mm}

%\setlength\parindent{0mm}

\begin{document}

\begin{center}
{\Large Deep Self-Taught Learning for Handwritten Character Recognition}

{\bf \large Information on Main Contributions}
\end{center}

\setlength{\parindent}{0cm}

%\vspace*{-2mm}
\section*{Background and Related Contributions}
%\vspace*{-2mm}
%{\large \bf Background and Related Contributions}

Recent theoretical and empirical work in statistical machine learning has
demonstrated the potential of learning algorithms for {\bf deep
architectures}, i.e., function classes obtained by composing multiple
levels of representation
\citep{Hinton06,ranzato-07-small,Bengio-nips-2006,VincentPLarochelleH2008,ranzato-08,Larochelle-jmlr-2009,Salakhutdinov+Hinton-2009,HonglakL2009,HonglakLNIPS2009,Jarrett-ICCV2009,Taylor-cvpr-2010}.
See~\citet{Bengio-2009} for a review of deep learning algorithms.

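As a concrete illustration of ``composing multiple levels of
representation'' (a sketch we add here, with hypothetical layer sizes and
nonlinearity, not the architecture of any particular cited model), consider:

{\small
\begin{verbatim}
import numpy as np

def level(x, W, b):
    # one level of representation: affine map followed by a nonlinearity
    return np.tanh(x @ W + b)

rng = np.random.default_rng(0)
sizes = [32 * 32, 500, 500, 62]    # hypothetical: 32x32 input, 62 classes
params = [(0.01 * rng.standard_normal((m, n)), np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]

h = rng.standard_normal((1, sizes[0]))  # a dummy input image
for W, b in params:                     # the deep architecture is the
    h = level(h, W, b)                  # composition of these levels
print(h.shape)                          # -> (1, 62)
\end{verbatim}
}
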
{\bf Self-taught learning}~\citep{RainaR2007} is a paradigm that combines
principles of semi-supervised and multi-task learning: the learner can
exploit examples that are unlabeled and possibly come from a distribution
different from the target distribution, e.g., from classes other than those
of interest. Self-taught learning has already been applied to deep
learners, but mostly to show the advantage of unlabeled
examples~\citep{Bengio-2009,WestonJ2008-small}.

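The two-phase recipe behind self-taught learning can be sketched as follows
(our illustration, not the code of any cited work; dimensions, learning
rate, and the synthetic data are all hypothetical): unsupervised pretraining
on unlabeled inputs that may come from other classes, followed by supervised
fine-tuning on the labeled task of interest.

{\small
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
d, nh, k, lr = 64, 32, 10, 0.1     # input dim, hidden units, classes, step
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
W, b, c = 0.01 * rng.standard_normal((d, nh)), np.zeros(nh), np.zeros(d)

# Phase 1: unsupervised pretraining (a denoising-autoencoder step) on
# unlabeled inputs, possibly drawn from other classes/distributions.
for x in rng.random((1000, d)):
    xn = x * (rng.random(d) > 0.3)         # masking corruption
    h = sigmoid(xn @ W + b)                # encode
    r = sigmoid(h @ W.T + c)               # decode (tied weights)
    dr = r - x                             # cross-entropy gradient
    dh = (dr @ W) * h * (1 - h)
    W -= lr * (np.outer(xn, dh) + np.outer(dr, h))
    b -= lr * dh
    c -= lr * dr

# Phase 2: supervised fine-tuning on the labeled target task, reusing
# the pretrained representation (only the output layer is trained here).
V, v = 0.01 * rng.standard_normal((nh, k)), np.zeros(k)
for x, y in zip(rng.random((200, d)), rng.integers(0, k, 200)):
    h = sigmoid(x @ W + b)
    p = np.exp(h @ V + v); p /= p.sum()    # softmax probabilities
    p[y] -= 1.0                            # gradient of NLL w.r.t. logits
    V -= lr * np.outer(h, p)
    v -= lr * p
\end{verbatim}
}
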
There are already theoretical arguments~\citep{baxter95a} supporting the claim
that learning an {\bf intermediate representation} shared across tasks can be
beneficial for multi-task learning. It has also been argued~\citep{Bengio-2009}
that {\bf multiple levels of representation} can bring a benefit over a single level.

%{\large \bf Main Claim}
%\vspace*{-2mm}
\section*{Main Claim}
%\vspace*{-2mm}

We claim that deep learners, with several levels of representation, can
benefit more from self-taught learning than shallow learners (with a single
level), both in the multi-task setting and, more generally, when learning
from {\em out-of-distribution examples}.

%{\large \bf Contribution to Machine Learning}
%\vspace*{-2mm}
\section*{Contribution to Machine Learning}
%\vspace*{-2mm}

We show evidence for the above claim in a large-scale setting, with
a training set consisting of hundreds of millions of examples, in the
context of handwritten character recognition with 62 classes (upper-case,
lower-case, digits).

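For concreteness, the 62-class label set is just the digits plus the
upper- and lower-case letters, e.g., in Python:

{\small
\begin{verbatim}
import string
# digits (10) + upper-case (26) + lower-case (26) = 62 classes
classes = list(string.digits + string.ascii_uppercase
               + string.ascii_lowercase)
assert len(classes) == 62
\end{verbatim}
}
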
%{\large \bf Evidence to Support the Claim}
%\vspace*{-2mm}
\section*{Evidence to Support the Claim}
%\vspace*{-2mm}

In the above experimental setting, we show that {\em deep learners benefited
significantly more from the multi-task setting than a corresponding shallow
learner}, and that they benefited more from {\em distorted (out-of-distribution)
examples} (i.e., examples drawn from a distribution broader than the one from
which the test examples come).

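The distorted examples mentioned above can be viewed as the output of a
stochastic transformation pipeline applied to clean character images. The
sketch below (our illustration, with hypothetical parameter ranges, and far
simpler than an actual pipeline) conveys the idea:

{\small
\begin{verbatim}
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)

def distort(img):
    # random rotation, translation, and pixel noise: the resulting
    # distribution is broader than that of the clean test images
    img = ndimage.rotate(img, angle=rng.uniform(-15, 15), reshape=False)
    img = ndimage.shift(img, shift=rng.uniform(-2, 2, size=2))
    return np.clip(img + rng.normal(0, 0.05, img.shape), 0.0, 1.0)

clean = rng.random((32, 32))      # stand-in for a character image
augmented = [distort(clean) for _ in range(5)]
\end{verbatim}
}
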
In addition, we show that they {\em beat previously published results} on this task
(NIST Special Database 19)
and {\bf reach human-level performance} on both handwritten digit classification and
62-class handwritten character recognition.

\newpage

{\small
\bibliography{strings,strings-short,strings-shorter,ift6266_ml,specials,aigaion-shorter}
%\bibliographystyle{plainnat}
\bibliographystyle{unsrtnat}
%\bibliographystyle{apalike}
}


\end{document}