comparison writeup/aistats2011_cameraready.tex @ 636:83d53ffe3f25
eqns
author | Yoshua Bengio <bengioy@iro.umontreal.ca> |
date | Sat, 19 Mar 2011 23:01:46 -0400 |
parents | 54e8958e963b |
children | fe98896745a5 |
635:d2d7ce0f0942 | 636:83d53ffe3f25 |
429 | 429 |
430 The experiments are performed using MLPs (with a single | 430 The experiments are performed using MLPs (with a single |
431 hidden layer) and deep SDAs. | 431 hidden layer) and deep SDAs. |
432 \emph{Hyper-parameters are selected based on the {\bf NISTP} validation set error.} | 432 \emph{Hyper-parameters are selected based on the {\bf NISTP} validation set error.} |
433 | 433 |
434 {\bf Multi-Layer Perceptrons (MLP).} The MLP output is estimated | 434 {\bf Multi-Layer Perceptrons (MLP).} The MLP output is estimated with |
435 \[ | 435 \[ |
436 P({\rm class}|{\rm input}=x) | 436 P({\rm class}|{\rm input}=x)={\rm softmax}(b_2+W_2\tanh(b_1+W_1 x)), |
437 \] | |
438 with | |
439 \[ | |
440 f(x)={\rm softmax}(b_2+W_2\tanh(b_1+W_1 x)), | |
441 \] | 437 \] |
442 i.e., two layers, where | 438 i.e., two layers, where |
443 \[ | 439 \[ |
444 p={\rm softmax}(a) | 440 p={\rm softmax}(a) |
445 \] | 441 \] |
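As a reading aid, not part of the changeset: the hunk above collapses two displays into the single equation $P({\rm class}|{\rm input}=x)={\rm softmax}(b_2+W_2\tanh(b_1+W_1 x))$. Below is a minimal NumPy sketch of that forward pass, with parameter names mirroring the equation; the softmax is the usual $p_i(a)=\exp(a_i)/\sum_j \exp(a_j)$ referred to by the truncated context line. A matching sketch of the denoising auto-encoder in the second hunk follows the diff.

```python
import numpy as np

def softmax(a):
    # p_i = exp(a_i) / sum_j exp(a_j); shift by max(a) for numerical stability
    e = np.exp(a - np.max(a))
    return e / e.sum()

def mlp_output(x, W1, b1, W2, b2):
    # P(class | input=x) = softmax(b2 + W2 tanh(b1 + W1 x)):
    # a single tanh hidden layer followed by a softmax output layer.
    h = np.tanh(b1 + W1 @ x)      # hidden activations, shape (n_hidden,)
    return softmax(b2 + W2 @ h)   # class probabilities, shape (n_classes,)
```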
517 Auto-encoder is presented with a stochastically corrupted version $\tilde{x}$ | 513 Auto-encoder is presented with a stochastically corrupted version $\tilde{x}$ |
518 of the input $x$ and trained to produce a reconstruction $z$ | 514 of the input $x$ and trained to produce a reconstruction $z$ |
519 of the uncorrupted input $x$. Because the network has to denoise, the | 515 of the uncorrupted input $x$. Because the network has to denoise, the |
520 hidden units $y$ are forced to represent the leading regularities in | 516 hidden units $y$ are forced to represent the leading regularities in |
521 the data. Following~\citep{VincentPLarochelleH2008-very-small} | 517 the data. Following~\citep{VincentPLarochelleH2008-very-small} |
522 the hidden units' output $y$ is obtained through | 518 the hidden units' output $y$ is obtained through the sigmoid-affine |
| 519 encoder |
523 \[ | 520 \[ |
524 y={\rm sigm}(c+V x) | 521 y={\rm sigm}(c+V x) |
525 \] | 522 \] |
526 where ${\rm sigm}(a)=1/(1+\exp(-a))$ | 523 where ${\rm sigm}(a)=1/(1+\exp(-a))$ |
527 and the reconstruction is | 524 and the reconstruction is obtained through the same transformation |
528 \[ | 525 \[ |
529 z={\rm sigm}(d+V' y). | 526 z={\rm sigm}(d+V' y) |
530 \] | 527 \] |
| 528 but using the transpose of the encoder weights. |
531 We minimize the training | 529 We minimize the training |
532 set average of the cross-entropy | 530 set average of the cross-entropy |
533 reconstruction error | 531 reconstruction error |
534 \[ | 532 \[ |
535 L_H(x,z)=-\sum_i [x_i \log z_i + (1-x_i) \log(1-z_i)]. | 533 L_H(x,z)=-\sum_i [x_i \log z_i + (1-x_i) \log(1-z_i)]. |
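Again as an aid rather than part of the changeset, here is a matching NumPy sketch of the denoising auto-encoder described in the second hunk: corrupt $x$ into $\tilde{x}$, encode with the sigmoid-affine encoder $y={\rm sigm}(c+V\tilde{x})$, and reconstruct with the tied-weight decoder $z={\rm sigm}(d+V^\top y)$. The masking corruption is an assumption for illustration; the hunk does not show which corruption process is used.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigm(a):
    # sigm(a) = 1 / (1 + exp(-a))
    return 1.0 / (1.0 + np.exp(-a))

def corrupt(x, noise_level=0.25):
    # Assumed corruption process: zero out a random fraction of the inputs.
    return x * (rng.random(x.shape) >= noise_level)

def encode(x_tilde, V, c):
    # y = sigm(c + V x), applied to the corrupted input during training
    return sigm(c + V @ x_tilde)

def decode(y, V, d):
    # z = sigm(d + V' y): the same transformation with transposed weights
    return sigm(d + V.T @ y)
```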
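The training criterion treats each input dimension as a Bernoulli target for the corresponding reconstructed probability. A minimal sketch of the per-example cross-entropy, with the clipping one needs in practice to keep the logarithms finite (the epsilon guard is an addition for this sketch, not in the paper):

```python
import numpy as np

def cross_entropy(x, z, eps=1e-12):
    # L_H(x, z) = -sum_i [ x_i log z_i + (1 - x_i) log(1 - z_i) ]
    z = np.clip(z, eps, 1.0 - eps)  # guard against log(0)
    return -np.sum(x * np.log(z) + (1.0 - x) * np.log(1.0 - z))
```

Training minimizes the average of this loss over the training set with respect to $V$, $c$ and $d$, with gradients obtained by backpropagation (in this repository presumably through Theano, though the hunk does not show that).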