changeset 636:83d53ffe3f25

eqns
author Yoshua Bengio <bengioy@iro.umontreal.ca>
date Sat, 19 Mar 2011 23:01:46 -0400
parents d2d7ce0f0942
children fe98896745a5
files writeup/aistats2011_cameraready.tex
diffstat 1 files changed, 7 insertions(+), 9 deletions(-)
--- a/writeup/aistats2011_cameraready.tex	Sat Mar 19 22:58:06 2011 -0400
+++ b/writeup/aistats2011_cameraready.tex	Sat Mar 19 23:01:46 2011 -0400
@@ -431,13 +431,9 @@
 hidden layer) and deep SDAs.
 \emph{Hyper-parameters are selected based on the {\bf NISTP} validation set error.}
 
-{\bf Multi-Layer Perceptrons (MLP).}  The MLP output estimated 
+{\bf Multi-Layer Perceptrons (MLP).}  The MLP output is estimated with
 \[
-P({\rm class}|{\rm input}=x)
-\]
-with 
-\[
-f(x)={\rm softmax}(b_2+W_2\tanh(b_1+W_1 x)),
+P({\rm class}|{\rm input}=x)={\rm softmax}(b_2+W_2\tanh(b_1+W_1 x)),
 \] 
 i.e., two layers, where 
 \[
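As an illustration only (not part of the changeset), the two-layer form above can be sketched in a few lines of NumPy; the names x, W1, b1, W2, b2 mirror the equation's symbols, and the code is an assumed reading of the formula rather than the paper's actual implementation:

    import numpy as np

    def softmax(a):
        # numerically stable softmax
        e = np.exp(a - a.max())
        return e / e.sum()

    def mlp_posterior(x, W1, b1, W2, b2):
        # P(class | input = x) = softmax(b2 + W2 tanh(b1 + W1 x))
        h = np.tanh(b1 + W1 @ x)       # first (tanh) hidden layer
        return softmax(b2 + W2 @ h)    # second (softmax) output layer
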
@@ -519,15 +515,17 @@
 of the uncorrupted input $x$. Because the network has to denoise, it is
 forcing the hidden units $y$ to represent the leading regularities in
 the data. Following~\citep{VincentPLarochelleH2008-very-small} 
-the hidden units output $y$ is obtained through 
+the hidden units' output $y$ is obtained through the sigmoid-affine
+encoder
 \[
  y={\rm sigm}(c+V x)
 \]
 where ${\rm sigm}(a)=1/(1+\exp(-a))$
-and the reconstruction is 
+and the reconstruction is obtained through the same form of transformation
 \[ 
- z={\rm sigm}(d+V' y).
+ z={\rm sigm}(d+V' y)
 \]
+but using the transpose of the encoder weights.
 We minimize the training
 set average of the cross-entropy
 reconstruction error
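
For the denoising auto-encoder equations in this hunk, a minimal NumPy sketch of the tied (transposed) weights and the cross-entropy reconstruction error; feeding the encoder the corrupted input while scoring against the uncorrupted x follows the denoising description above, and the names (sigm, V, c, d, x_tilde) are illustrative assumptions, not the paper's code:

    import numpy as np

    def sigm(a):
        # sigm(a) = 1 / (1 + exp(-a))
        return 1.0 / (1.0 + np.exp(-a))

    def dae_reconstruction_error(x, x_tilde, V, c, d, eps=1e-12):
        # encoder: y = sigm(c + V x~), applied to the corrupted input x~
        y = sigm(c + V @ x_tilde)
        # decoder with transposed encoder weights: z = sigm(d + V' y)
        z = sigm(d + V.T @ y)
        # cross-entropy between the uncorrupted x and the reconstruction z
        return -np.sum(x * np.log(z + eps) + (1.0 - x) * np.log(1.0 - z + eps))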