# HG changeset patch
# User Xavier Glorot <glorotxa@iro.umontreal.ca>
# Date 1272659383 14400
# Node ID ace489930918351636b97cace7926fc45eb45002
# Parent  a7fab59de174ac12868434e5616aad7f7775b77d
# Parent  6e5f0f50ddaba3eeccbb63671e7ceb28b992b908
merge

diff -r a7fab59de174 -r ace489930918 writeup/techreport.tex
--- a/writeup/techreport.tex	Fri Apr 30 16:29:17 2010 -0400
+++ b/writeup/techreport.tex	Fri Apr 30 16:29:43 2010 -0400
@@ -214,6 +214,19 @@
 
 \section{Learning Algorithms for Deep Architectures}
 
+Learning in deep networks has long been a problem, since well-known learning algorithms do not generalize well on deep architectures.
+Using these training algorithms on deep networks usually yields worse generalization than on shallow networks.
+Recently, new initialization techniques have been discovered that enable better generalization overall.
+
+One of these initialization techniques relies on denoising auto-encoders.
+The principle is that each layer, starting from the bottom, is trained to encode and decode its input, and the encoding part is kept as the initialization for the weights and biases of the network.
+For more details, see Section~\ref{SdA}.
+
+After initialization is done, standard training algorithms work.
+In this case, since we have large data sets, we use stochastic gradient descent.
+This resembles minibatch training, except that the batches are selected at random.
+To speed up computation, we randomly pre-arranged examples in batches and used those for all training experiments.
+
 \section{Experimental Setup}
 
 \subsection{Training Datasets}
@@ -263,9 +276,11 @@
 The parameters to adapt are the weight matrix and the bias vector for each layer.
 
 \subsubsection{Stacked Denoising Auto-Encoders (SDAE)}
+\label{SdA}
 
 Auto-encoders are essentially a way to initialize the weights of the network to enable better generalization.
-Denoising auto-encoders are a variant where the input is corrupted with random noise before trying to repair it.
+This is essentially unsupervised training, where the layer is made to reconstruct its input through an encoding and a decoding phase.
+Denoising auto-encoders are a variant where the input is corrupted with random noise, but the target is the uncorrupted input.
 The principle behind these initialization methods is that the network will learn the inherent relation between portions of the data and be able to represent them thus helping with whatever task we want to perform.
 
 An auto-encoder unit is formed of two MLP layers with the bottom one called the encoding layer and the top one the decoding layer.