comparison writeup/mlj_submission.tex @ 587:b1be957dd1be

Added mlj_submission to group every file needed for the MLJ submission.
author fsavard
date Thu, 30 Sep 2010 17:51:02 -0400
parents 4933077b8676
children 9a6abcf143e8
comparing 585:4933077b8676 with 587:b1be957dd1be
(lines removed in 587 are prefixed with "-", lines added with "+"; unchanged context is shown unprefixed; "..." marks lines elided from the view)
-\documentclass{article} % For LaTeX2e
+\RequirePackage{fix-cm} % from template
+
+%\documentclass{article} % For LaTeX2e
+\documentclass[smallcondensed]{svjour3} % onecolumn (ditto)
+
\usepackage{times}
\usepackage{wrapfig}
-\usepackage{amsthm,amsmath,bbm}
+%\usepackage{amsthm} % not to be used with springer tools
+\usepackage{amsmath}
+\usepackage{bbm}
\usepackage[psamsfonts]{amssymb}
-\usepackage{algorithm,algorithmic}
+%\usepackage{algorithm,algorithmic} % not used after all
\usepackage[utf8]{inputenc}
\usepackage{graphicx,subfigure}
-\usepackage[numbers]{natbib}
+\usepackage{natbib} % was [numbers]{natbib}

\addtolength{\textwidth}{10mm}
\addtolength{\evensidemargin}{-5mm}
\addtolength{\oddsidemargin}{-5mm}

%\setlength\parindent{0mm}

\title{Deep Self-Taught Learning for Handwritten Character Recognition}
\author{
+Yoshua Bengio \and
Frédéric Bastien \and
-Yoshua Bengio \and
Arnaud Bergeron \and
Nicolas Boulanger-Lewandowski \and
Thomas Breuel \and
Youssouf Chherawala \and
Moustapha Cisse \and
...
Salah Rifai \and
Francois Savard \and
Guillaume Sicard
}
\date{September 30th, submission to MLJ special issue on learning from multi-label data}
+\journalname{Machine Learning Journal}
+\institute{Frédéric Bastien \and \\
+Yoshua Bengio \and \\
+Arnaud Bergeron \and \\
+Nicolas Boulanger-Lewandowski \and \\
+Youssouf Chherawala \and \\
+Moustapha Cisse \and \\
+Myriam Côté \and \\
+Dumitru Erhan \and \\
+Jeremy Eustache \and \\
+Xavier Glorot \and \\
+Xavier Muller \and \\
+Sylvain Pannetier-Lebeuf \and \\
+Razvan Pascanu \and \\
+Salah Rifai \and \\
+Francois Savard \and \\
+Guillaume Sicard \at
+Dept. IRO, Universite de Montreal, C.P. 6128, Montreal, QC, H3C 3J7, Canada\\
+\email{yoshua.bengio@umontreal.ca}
+\and
+Thomas Breuel \at
+Department of Computer Science, University of Kaiserslautern, Postfach 3049, 67653 Kaiserslautern, Germany
+}
+

\begin{document}

%\makeanontitle
\maketitle
...
%\vspace*{-2mm}
\begin{abstract}
Recent theoretical and empirical work in statistical machine learning has demonstrated the importance of learning algorithms for deep architectures, i.e., function classes obtained by composing multiple non-linear transformations. Self-taught learning (exploiting unlabeled examples or examples from other distributions) has already been applied to deep learners, but mostly to show the advantage of unlabeled examples. Here we explore the advantage brought by {\em out-of-distribution examples}. For this purpose we developed a powerful generator of stochastic variations and noise processes for character images, including not only affine transformations but also slant, local elastic deformations, changes in thickness, background images, grey level changes, contrast, occlusion, and various types of noise. The out-of-distribution examples are obtained from these highly distorted images or by including examples of object classes different from those in the target test set. We show that {\em deep learners benefit more from out-of-distribution examples than a corresponding shallow learner}, at least in the area of handwritten character recognition. In fact, we show that they beat previously published results and reach human-level performance on both handwritten digit classification and 62-class handwritten character recognition.
\end{abstract}
%\vspace*{-3mm}

Keywords: self-taught learning, multi-task learning, out-of-distribution examples, deep learning, handwriting recognition.

\section{Introduction}
%\vspace*{-1mm}

{\bf Deep Learning} has emerged as a promising new area of research in
-statistical machine learning (see~\citet{Bengio-2009} for a review).
+statistical machine learning (see \citet{Bengio-2009} for a review).
Learning algorithms for deep architectures are centered on the learning
of useful representations of data, which are better suited to the task at hand,
and are organized in a hierarchy with multiple levels.
This is in part inspired by observations of the mammalian visual cortex,
which consists of a chain of processing elements, each of which is associated with a
different representation of the raw visual input. In fact,
it was found recently that the features learnt in deep architectures resemble
those observed in the first two of these stages (in areas V1 and V2
-of visual cortex)~\citep{HonglakL2008}, and that they become more and
+of visual cortex) \citep{HonglakL2008}, and that they become more and
more invariant to factors of variation (such as camera movement) in
higher layers~\citep{Goodfellow2009}.
Learning a hierarchy of features increases the
ease and practicality of developing representations that are at once
tailored to specific tasks, yet are able to borrow statistical strength
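Outside the line-by-line view, the preamble edits above amount to adopting Springer's svjour3 front matter. The skeleton below is a minimal, hypothetical illustration of that pattern, with placeholder title, authors, and addresses; it assumes svjour3.cls from Springer's LaTeX template is available, and it compresses the submission's \institute block (which lists every Université de Montréal author before a single \at address) down to two authors.

% Minimal svjour3 skeleton illustrating the front matter adopted in this changeset.
% Placeholder names and addresses; assumes svjour3.cls (Springer template) is installed.
\RequirePackage{fix-cm}                 % load before the class, as the template requires
\documentclass[smallcondensed]{svjour3} % one-column Springer journal layout
\usepackage[utf8]{inputenc}

\journalname{Machine Learning Journal}
\title{Example Title}
\author{First Author \and Second Author}
\institute{First Author \at
           Example University, Example City, Country \\
           \email{first.author@example.org}
           \and
           Second Author \at
           Another Institute, Another City, Country}
\date{Received: date / Accepted: date}

\begin{document}
\maketitle
\begin{abstract}
Placeholder abstract.
\end{abstract}
\end{document}

Note that fix-cm is loaded with \RequirePackage before \documentclass, as in the template, and that amsthm is commented out because, per the changeset's own comment, it is not to be used with the Springer tools (the algorithm packages are dropped simply because they were no longer used).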
...
basins of attraction are not discovered by pure supervised learning
(with or without self-taught settings), and more labeled examples
does not allow the model to go from the poorer basins of attraction discovered
by the purely supervised shallow models to the kind of better basins associated
with deep learning and self-taught learning.

A Flash demo of the recognizer (where both the MLP and the SDA can be compared)
can be executed on-line at {\tt http://deep.host22.com}.


\section*{Appendix I: Detailed Numerical Results}
...
\end{table}

%\afterpage{\clearpage}
\clearpage
{
+\bibliographystyle{spbasic} % basic style, author-year citations
\bibliography{strings,strings-short,strings-shorter,ift6266_ml,specials,aigaion-shorter}
%\bibliographystyle{plainnat}
-\bibliographystyle{unsrtnat}
+%\bibliographystyle{unsrtnat}
%\bibliographystyle{apalike}
}


\end{document}
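The other substantive change is the switch to author-year citations: natbib is now loaded without the [numbers] option and \bibliographystyle{spbasic} (Springer's basic author-year style) replaces unsrtnat. The fragment below is only a sketch of how the existing \citet and \citep calls resolve under that setup; the rendered form shown is approximate and assumes spbasic.bst is installed and that Bengio-2009 resolves to an entry in the listed .bib files.

% Author-year citations with natbib + spbasic, as configured in this changeset.
% The rendered form in the comment is approximate; assumes spbasic.bst is installed.
\usepackage{natbib}          % without [numbers], natbib stays in author-year mode

% In the text (e.g., the introduction above):
%   see \citet{Bengio-2009} for a review   ->   roughly "see Bengio (2009) for a review"

% At the end of the document:
\bibliographystyle{spbasic}
\bibliography{strings,strings-short,strings-shorter,ift6266_ml,specials,aigaion-shorter}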