comparison writeup/aistats2011_cameraready.tex @ 638:677d1b1d8158

fits
author Yoshua Bengio <bengioy@iro.umontreal.ca>
date Sat, 19 Mar 2011 23:11:17 -0400
parents fe98896745a5
children 507cb92d8e15
comparison
637:fe98896745a5 || 638:677d1b1d8158
433 || 433
434 {\bf Multi-Layer Perceptrons (MLP).} The MLP output estimated with || 434 {\bf Multi-Layer Perceptrons (MLP).} The MLP output estimated with
435 \[ || 435 \[
436 P({\rm class}|{\rm input}=x)={\rm softmax}(b_2+W_2\tanh(b_1+W_1 x)), || 436 P({\rm class}|{\rm input}=x)={\rm softmax}(b_2+W_2\tanh(b_1+W_1 x)),
437 \] || 437 \]
438 i.e., two layers, where || 438 i.e., two layers, where $p={\rm softmax}(a)$ means that
439 \[ || 439 $p_i(x)=\exp(a_i)/\sum_j \exp(a_j)$
440 p={\rm softmax}(a) ||
441 \] ||
442 means that ||
443 \[ ||
444 p_i(x)=\exp(a_i)/\sum_j \exp(a_j) ||
445 \] ||
446 representing the probability || 440 representing the probability
447 for class $i$, $\tanh$ is the element-wise || 441 for class $i$, $\tanh$ is the element-wise
448 hyperbolic tangent, $b_i$ are parameter vectors, and $W_i$ are || 442 hyperbolic tangent, $b_i$ are parameter vectors, and $W_i$ are
449 parameter matrices (one per layer). The || 443 parameter matrices (one per layer). The
450 number of rows of $W_1$ is called the number of hidden units (of the || 444 number of rows of $W_1$ is called the number of hidden units (of the