comparison writeup/aistats2011_cameraready.tex @ 638:677d1b1d8158
fits
author | Yoshua Bengio <bengioy@iro.umontreal.ca> |
---|---|
date | Sat, 19 Mar 2011 23:11:17 -0400 |
parents | fe98896745a5 |
children | 507cb92d8e15 |
comparison
637:fe98896745a5 | 638:677d1b1d8158 |
---|---|
433 | 433 |
434 {\bf Multi-Layer Perceptrons (MLP).} The MLP output estimated with | 434 {\bf Multi-Layer Perceptrons (MLP).} The MLP output estimated with |
435 \[ | 435 \[ |
436 P({\rm class}|{\rm input}=x)={\rm softmax}(b_2+W_2\tanh(b_1+W_1 x)), | 436 P({\rm class}|{\rm input}=x)={\rm softmax}(b_2+W_2\tanh(b_1+W_1 x)), |
437 \] | 437 \] |
438 i.e., two layers, where | 438 i.e., two layers, where $p={\rm softmax}(a)$ means that |
439 \[ | 439 $p_i(x)=\exp(a_i)/\sum_j \exp(a_j)$ |
440 p={\rm softmax}(a) | |
441 \] | |
442 means that | |
443 \[ | |
444 p_i(x)=\exp(a_i)/\sum_j \exp(a_j) | |
445 \] | |
446 representing the probability | 440 representing the probability |
447 for class $i$, $\tanh$ is the element-wise | 441 for class $i$, $\tanh$ is the element-wise |
448 hyperbolic tangent, $b_i$ are parameter vectors, and $W_i$ are | 442 hyperbolic tangent, $b_i$ are parameter vectors, and $W_i$ are |
449 parameter matrices (one per layer). The | 443 parameter matrices (one per layer). The |
450 number of rows of $W_1$ is called the number of hidden units (of the | 444 number of rows of $W_1$ is called the number of hidden units (of the |