annotate pylearn/algorithms/sigmoid_output_SdA.py @ 1476:8c10bda4bb5f
Configured default train/valid/test split for icml07.MNIST_rotated_background
dataset. Defaults are the ones used by Hugo in the ICML07 paper and in all
contracting auto-encoder papers.
author: gdesjardins
date: Fri, 20 May 2011 16:53:00 -0400
parents: daa355332b66
children: (none)
rev 939, changeset daa355332b66: added sigmoid_output_SdA.py (Yoshua Bengio <bengioy@iro.umontreal.ca>)
"""
This tutorial introduces stacked denoising auto-encoders (SdA) using Theano.

Denoising autoencoders are the building blocks for SdA.
They are based on auto-encoders such as the ones used in Bengio et al. 2007.
An autoencoder takes an input x and first maps it to a hidden representation
y = f_{\theta}(x) = s(Wx+b), parameterized by \theta={W,b}. The resulting
latent representation y is then mapped back to a "reconstructed" vector
z \in [0,1]^d in input space: z = g_{\theta'}(y) = s(W'y + b'). The weight
matrix W' can optionally be constrained such that W' = W^T, in which case
the autoencoder is said to have tied weights. The network is trained to
minimize the reconstruction error (the error between x and z).

For the denoising autoencoder, during training, x is first corrupted into
\tilde{x}, a partially destroyed version of x obtained by means of a
stochastic mapping. Afterwards y is computed as before (using \tilde{x}),
y = s(W\tilde{x} + b), and z as s(W'y + b'). The reconstruction error is
now measured between z and the uncorrupted input x, and is computed as
the cross-entropy:

    - \sum_{k=1}^d [x_k \log z_k + (1-x_k) \log(1-z_k)]

References:
    - P. Vincent, H. Larochelle, Y. Bengio, P.A. Manzagol: Extracting and
      Composing Robust Features with Denoising Autoencoders, ICML'08,
      1096-1103, 2008
    - Y. Bengio, P. Lamblin, D. Popovici, H. Larochelle: Greedy Layer-Wise
      Training of Deep Networks, Advances in Neural Information Processing
      Systems 19, 2007
"""
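As a standalone illustration of the forward pass described in the docstring above (stochastic corruption, hidden code y = s(Wx+b), tied-weight reconstruction z = s(W'y+b'), cross-entropy cost), here is a minimal numpy sketch. The shapes, seed, and corruption level are arbitrary choices for the example, not values taken from this file:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

d, n_hidden = 8, 4
W = rng.normal(scale=0.1, size=(d, n_hidden))   # encoder weights
b = np.zeros(n_hidden)                          # encoder bias
b_prime = np.zeros(d)                           # decoder bias

x = rng.random(d)                    # an input in [0, 1]^d
keep = rng.random(d) > 0.3           # stochastic mapping: zero some inputs
x_tilde = x * keep                   # corrupted version \tilde{x}

y = sigmoid(x_tilde @ W + b)         # hidden representation
z = sigmoid(y @ W.T + b_prime)       # reconstruction, tied weights W' = W^T

# cross-entropy between z and the *uncorrupted* x
cost = -np.sum(x * np.log(z) + (1 - x) * np.log(1 - z))
```

Note that the cost compares z against the clean x, not against the corrupted input; that asymmetry is what makes the autoencoder "denoising".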

import numpy, time, cPickle, gzip, sys, os

import theano
import theano.tensor as T
from theano.tensor.shared_randomstreams import RandomStreams

from logistic_sgd import load_data
from mlp import HiddenLayer
from dA import dA

class BinaryLogisticRegressions(object):
    """Multiple 2-class logistic regressions class

    The logistic regressions are fully described by a weight matrix :math:`W`
    and a bias vector :math:`b`. Classification is done by projecting data
    points onto a set of hyperplanes, the distance to which is used to
    determine a class-membership probability.
    """

    def __init__(self, input, n_in, n_out):
        """ Initialize the parameters of the logistic regression

        :type input: theano.tensor.TensorType
        :param input: symbolic variable that describes the input of the
                      architecture (one minibatch)

        :type n_in: int
        :param n_in: number of input units, the dimension of the space in
                     which the datapoints lie

        :type n_out: int
        :param n_out: number of output units, the dimension of the space in
                      which the labels lie
        """

        # initialize the weights W as a zero matrix of shape (n_in, n_out)
        self.W = theano.shared(value=numpy.zeros((n_in, n_out),
                                                 dtype=theano.config.floatX),
                               name='W')
        # initialize the biases b as a vector of n_out zeros
        self.b = theano.shared(value=numpy.zeros((n_out,),
                                                 dtype=theano.config.floatX),
                               name='b')

        # compute the vector of class-membership probabilities in symbolic form
        self.p_y_given_x = T.nnet.sigmoid(T.dot(input, self.W) + self.b)

        # compute the prediction as the class whose probability is maximal,
        # in symbolic form
        self.y_pred = T.argmax(self.p_y_given_x, axis=1)

        # parameters of the model
        self.params = [self.W, self.b]

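The constructor builds p_y_given_x = sigmoid(XW + b) column-wise: each output unit is an independent 2-class logistic regression, and y_pred picks the column with maximal probability. A numpy sketch of those two lines, using the same zero initialization as the class (the input minibatch values are made up for illustration):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

n_in, n_out = 4, 3
W = np.zeros((n_in, n_out))           # zero initialization, as in the class
b = np.zeros(n_out)

X = np.array([[0.0, 1.0, 0.5, 0.2],
              [1.0, 0.0, 0.3, 0.9]])  # a made-up minibatch of two examples

p_y_given_x = sigmoid(X @ W + b)      # one 2-class probability per output unit
y_pred = np.argmax(p_y_given_x, axis=1)
```

With zero weights every probability is exactly 0.5, so argmax breaks the tie in favor of column 0; training is what differentiates the outputs.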
    def negative_log_likelihood(self, y):
        """Return the mean of the negative log-likelihood of the prediction
        of this model under a given target distribution.

        .. math::

            \frac{1}{|\mathcal{D}|} \mathcal{L} (\theta=\{W,b\}, \mathcal{D}) =
            \frac{1}{|\mathcal{D}|} \sum_{i=0}^{|\mathcal{D}|}
                \log(P(Y^{(i)}=y^{(i)}|x^{(i)}, W,b)) \\
            \ell (\theta=\{W,b\}, \mathcal{D})

        :type y: theano.tensor.TensorType
        :param y: matrix of binary targets, one row per example, with the
                  same shape as ``p_y_given_x``

        Note: we use the mean instead of the sum so that
        the learning rate is less dependent on the batch size
        """
        return -T.mean(T.sum(y * T.log(self.p_y_given_x)
                             + (1 - y) * T.log(1 - self.p_y_given_x),
                             axis=1))

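Note that this cost treats y as a binary indicator matrix with the same shape as p_y_given_x: the sum over axis=1 adds up the independent per-unit cross-entropies, and the mean averages over the minibatch. A numpy sketch of the same expression, with made-up weights and targets:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

X = np.array([[0.5, -1.0],
              [2.0,  0.3]])           # two examples, two input features
W = np.array([[ 0.4, -0.2,  0.1],
              [ 0.3,  0.5, -0.7]])    # three independent binary outputs
b = np.zeros(3)
y = np.array([[1, 0, 1],
              [0, 1, 0]])             # binary indicator targets

p = sigmoid(X @ W + b)                # p_y_given_x
# sum the per-unit cross-entropies, then average over the minibatch
nll = -np.mean(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p), axis=1))
```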
    def errors(self, y):
        """Return a float representing the number of errors in the minibatch
        over the total number of examples of the minibatch; i.e. the zero-one
        loss over the size of the minibatch

        :type y: theano.tensor.TensorType
        :param y: corresponds to a vector that gives for each example the
                  correct label
        """

        # check if y has the same dimension as y_pred
        if y.ndim != self.y_pred.ndim:
            raise TypeError('y should have the same shape as self.y_pred',
                            ('y', y.type, 'y_pred', self.y_pred.type))
        # check if y is of the correct datatype
        if y.dtype.startswith('int'):
            # the T.neq operator returns a vector of 0s and 1s, where 1
            # represents a mistake in prediction
            return T.mean(T.neq(self.y_pred, y))
        else:
            raise NotImplementedError()

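The zero-one loss returned by errors is simply the fraction of mismatches between y_pred and y. In numpy terms, with made-up labels:

```python
import numpy as np

y_pred = np.array([1, 0, 2, 1])      # predicted labels for a minibatch
y      = np.array([1, 1, 2, 0])      # true labels
zero_one = np.mean(y_pred != y)      # fraction of the minibatch misclassified
```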

class SdA(object):
    """Stacked denoising auto-encoder class (SdA)

    A stacked denoising autoencoder model is obtained by stacking several
    dAs. The hidden layer of the dA at layer `i` becomes the input of
    the dA at layer `i+1`. The first-layer dA gets as input the input of
    the SdA, and the hidden layer of the last dA represents the output.
    Note that after pretraining, the SdA is dealt with as a normal MLP;
    the dAs are only used to initialize the weights.
    """

    def __init__(self, numpy_rng, theano_rng=None, n_ins=784,
                 hidden_layers_sizes=[500, 500], n_outs=10,
                 corruption_levels=[0.1, 0.1]):
        """ This class is made to support a variable number of layers.

        :type numpy_rng: numpy.random.RandomState
        :param numpy_rng: numpy random number generator used to draw initial
                          weights

        :type theano_rng: theano.tensor.shared_randomstreams.RandomStreams
        :param theano_rng: Theano random generator; if None is given, one is
                           generated based on a seed drawn from `numpy_rng`

        :type n_ins: int
        :param n_ins: dimension of the input to the SdA

        :type hidden_layers_sizes: list of ints
        :param hidden_layers_sizes: intermediate layer sizes; must contain
                                    at least one value

        :type n_outs: int
        :param n_outs: dimension of the output of the network

        :type corruption_levels: list of float
        :param corruption_levels: amount of corruption to use for each
                                  layer
        """

        self.sigmoid_layers = []
        self.dA_layers = []
        self.params = []
        self.n_layers = len(hidden_layers_sizes)

        assert self.n_layers > 0

        if not theano_rng:
            theano_rng = RandomStreams(numpy_rng.randint(2 ** 30))
        # allocate symbolic variables for the data
        self.x = T.matrix('x')   # the data is presented as rasterized images
        self.y = T.ivector('y')  # the labels are presented as a 1D vector
                                 # of [int] labels

        # The SdA is an MLP in which all weights of the intermediate layers
        # are shared with a different denoising autoencoder.
        # We will first construct the SdA as a deep multilayer perceptron,
        # and when constructing each sigmoidal layer we also construct a
        # denoising autoencoder that shares weights with that layer.
        # During pretraining we will train these autoencoders (which will
        # change the weights of the MLP as well).
        # During finetuning we will finish training the SdA by doing
        # stochastic gradient descent on the MLP.

        for i in xrange(self.n_layers):
            # construct the sigmoidal layer

            # the size of the input is either the number of hidden units of
            # the layer below, or the input size if we are on the first layer
            if i == 0:
                input_size = n_ins
            else:
                input_size = hidden_layers_sizes[i - 1]

            # the input to this layer is either the activation of the hidden
            # layer below, or the input of the SdA if we are on the first
            # layer
            if i == 0:
                layer_input = self.x
            else:
                layer_input = self.sigmoid_layers[-1].output

            sigmoid_layer = HiddenLayer(rng=numpy_rng,
                                        input=layer_input,
                                        n_in=input_size,
                                        n_out=hidden_layers_sizes[i],
                                        activation=T.nnet.sigmoid)
            # add the layer to our list of layers
            self.sigmoid_layers.append(sigmoid_layer)
            # it's arguably a philosophical question...
            # but we are going to only declare that the parameters of the
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
230 # sigmoid_layers are parameters of the StackedDAA |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
231 # the visible biases in the dA are parameters of those |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
232 # dA, but not the SdA |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
233 self.params.extend(sigmoid_layer.params) |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
234 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
235 # Construct a denoising autoencoder that shared weights with this |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
236 # layer |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
237 dA_layer = dA(numpy_rng = numpy_rng, theano_rng = theano_rng, input = layer_input, |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
238 n_visible = input_size, |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
239 n_hidden = hidden_layers_sizes[i], |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
240 W = sigmoid_layer.W, bhid = sigmoid_layer.b) |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
241 self.dA_layers.append(dA_layer) |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
242 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
243 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
244 # We now need to add a logistic layer on top of the MLP |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
245 #self.logLayer = LogisticRegression(\ |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
246 # input = self.sigmoid_layers[-1].output,\ |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
247 # n_in = hidden_layers_sizes[-1], n_out = n_outs) |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
248 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
249 self.logLayer = BinaryLogisticRegressions(\ |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
250 input = self.sigmoid_layers[-1].output,\ |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
251 n_in = hidden_layers_sizes[-1], n_out = n_outs) |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
252 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
253 self.params.extend(self.logLayer.params) |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
254 # construct a function that implements one step of finetunining |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
255 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
256 # compute the cost for second phase of training, |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
257 # defined as the negative log likelihood |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
258 #self.finetune_cost = self.logLayer.negative_log_likelihood(self.y) |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
259 self.finetune_cost = self.logLayer.negative_log_likelihood(self.y) |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
260 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
261 # compute the gradients with respect to the model parameters |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
262 # symbolic variable that points to the number of errors made on the |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
263 # minibatch given by self.x and self.y |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
264 self.errors = self.logLayer.errors(self.y) |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
265 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
266 def pretraining_functions(self, train_set_x, batch_size): |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
267 ''' Generates a list of functions, each of them implementing one |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
268 step in trainnig the dA corresponding to the layer with same index. |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
269 The function will require as input the minibatch index, and to train |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
270 a dA you just need to iterate, calling the corresponding function on |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
271 all minibatch indexes. |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
272 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
273 :type train_set_x: theano.tensor.TensorType |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
274 :param train_set_x: Shared variable that contains all datapoints used |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
275 for training the dA |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
276 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
277 :type batch_size: int |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
278 :param batch_size: size of a [mini]batch |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
279 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
280 :type learning_rate: float |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
281 :param learning_rate: learning rate used during training for any of |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
282 the dA layers |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
283 ''' |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
284 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
285 # index to a [mini]batch |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
286 index = T.lscalar('index') # index to a minibatch |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
287 corruption_level = T.scalar('corruption') # amount of corruption to use |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
288 learning_rate = T.scalar('lr') # learning rate to use |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
289 # number of batches |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
290 n_batches = train_set_x.value.shape[0] / batch_size |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
291 # begining of a batch, given `index` |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
292 batch_begin = index * batch_size |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
293 # ending of a batch given `index` |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
294 batch_end = batch_begin+batch_size |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
295 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
296 pretrain_fns = [] |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
297 for dA in self.dA_layers: |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
298 # get the cost and the updates list |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
299 cost,updates = dA.get_cost_updates( corruption_level, learning_rate) |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
300 # compile the theano function |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
301 fn = theano.function( inputs = [index, |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
302 theano.Param(corruption_level, default = 0.2), |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
303 theano.Param(learning_rate, default = 0.1)], |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
304 outputs = cost, |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
305 updates = updates, |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
306 givens = {self.x :train_set_x[batch_begin:batch_end]}) |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
307 # append `fn` to the list of functions |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
308 pretrain_fns.append(fn) |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
309 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
310 return pretrain_fns |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
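The `batch_begin`/`batch_end` arithmetic above is symbolic (Theano variables), but the indexing scheme itself is plain integer arithmetic. A minimal numpy sketch of the same minibatch slicing, with a hypothetical toy array standing in for `train_set_x`:

```python
import numpy as np

# Toy data standing in for train_set_x: 50 examples, 4 features each
train_set_x = np.arange(200).reshape(50, 4)
batch_size = 10

# number of batches, as computed above (integer division)
n_batches = train_set_x.shape[0] // batch_size

for index in range(n_batches):
    # beginning and ending of a batch, given `index`
    batch_begin = index * batch_size
    batch_end = batch_begin + batch_size
    minibatch = train_set_x[batch_begin:batch_end]
    assert minibatch.shape == (batch_size, 4)
```

Note that when `batch_size` does not evenly divide the number of examples, this floor division silently drops the last partial batch, which is also the behavior of the tutorial code.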

    def build_finetune_functions(self, datasets, batch_size, learning_rate):
        '''Generates a function `train` that implements one step of
        finetuning, a function `validate` that computes the error on
        a batch from the validation set, and a function `test` that
        computes the error on a batch from the testing set

        :type datasets: list of pairs of theano.tensor.TensorType
        :param datasets: a list that contains all the datasets;
                         it has to contain three pairs, `train`,
                         `valid`, `test`, in this order, where each pair
                         is formed of two Theano variables, one for the
                         datapoints, the other for the labels

        :type batch_size: int
        :param batch_size: size of a minibatch

        :type learning_rate: float
        :param learning_rate: learning rate used during the finetune stage
        '''

        (train_set_x, train_set_y) = datasets[0]
        (valid_set_x, valid_set_y) = datasets[1]
        (test_set_x,  test_set_y)  = datasets[2]

        # compute number of minibatches for training, validation and testing
        n_valid_batches = valid_set_x.value.shape[0] / batch_size
        n_test_batches  = test_set_x.value.shape[0] / batch_size

        index = T.lscalar('index')  # index to a [mini]batch

        # compute the gradients with respect to the model parameters
        gparams = T.grad(self.finetune_cost, self.params)

        # compute list of fine-tuning updates
        updates = {}
        for param, gparam in zip(self.params, gparams):
            updates[param] = param - gparam * learning_rate

        train_fn = theano.function(inputs = [index],
              outputs = self.finetune_cost,
              updates = updates,
              givens = {
                self.x: train_set_x[index * batch_size:(index + 1) * batch_size],
                self.y: train_set_y[index * batch_size:(index + 1) * batch_size]})

        test_score_i = theano.function([index], self.errors,
                 givens = {
                   self.x: test_set_x[index * batch_size:(index + 1) * batch_size],
                   self.y: test_set_y[index * batch_size:(index + 1) * batch_size]})

        valid_score_i = theano.function([index], self.errors,
                 givens = {
                   self.x: valid_set_x[index * batch_size:(index + 1) * batch_size],
                   self.y: valid_set_y[index * batch_size:(index + 1) * batch_size]})

        # Create a function that scans the entire validation set
        def valid_score():
            return [valid_score_i(i) for i in xrange(n_valid_batches)]

        # Create a function that scans the entire test set
        def test_score():
            return [test_score_i(i) for i in xrange(n_test_batches)]

        return train_fn, valid_score, test_score
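The finetuning update `updates[param] = param - gparam * learning_rate` above is one step of plain stochastic gradient descent. A small numpy sketch of that rule on a hypothetical parameter vector (the names here are illustrative, not part of the tutorial code):

```python
import numpy as np

learning_rate = 0.1
# toy parameter and gradient standing in for one entry of self.params / gparams
param = np.array([1.0, -2.0, 0.5])
gparam = np.array([0.2, -0.4, 1.0])

# one gradient-descent step, mirroring: updates[param] = param - gparam * learning_rate
new_param = param - gparam * learning_rate
# new_param is [0.98, -1.96, 0.4]
```

In the Theano version this assignment is not executed immediately; the `updates` dictionary is handed to `theano.function`, which applies all the updates to the shared parameters each time `train_fn` is called.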
377 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
378 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
379 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
380 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
381 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
382 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
383 def test_SdA( finetune_lr = 0.1, pretraining_epochs = 15, \ |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
384 pretrain_lr = 0.05, training_epochs = 1000, \ |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
385 dataset='../data/mnist.pkl.gz', batch_size = 1): |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
386 """ |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
387 Demonstrates how to train and test a stochastic denoising autoencoder. |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
388 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
389 This is demonstrated on MNIST. |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
390 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
391 :type learning_rate: float |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
392 :param learning_rate: learning rate used in the finetune stage |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
393 (factor for the stochastic gradient) |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
394 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
395 :type pretraining_epochs: int |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
396 :param pretraining_epochs: number of epoch to do pretraining |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
397 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
398 :type pretrain_lr: float |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
399 :param pretrain_lr: learning rate to be used during pre-training |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
400 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
401 :type n_iter: int |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
402 :param n_iter: maximal number of iterations ot run the optimizer |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
403 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
404 :type dataset: string |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
405 :param dataset: path the the pickled dataset |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
406 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
407 """ |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
408 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
409 datasets = load_data(dataset) |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
410 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
411 train_set_x, train_set_y = datasets[0] |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
412 valid_set_x, valid_set_y = datasets[1] |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
413 test_set_x , test_set_y = datasets[2] |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
414 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
415 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
416 # compute number of minibatches for training, validation and testing |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
417 n_train_batches = train_set_x.value.shape[0] / batch_size |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
418 |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
419 # numpy random generator |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
420 numpy_rng = numpy.random.RandomState(123) |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
421 print '... building the model' |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
422 # construct the stacked denoising autoencoder class |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
423 sda = SdA( numpy_rng = numpy_rng, n_ins = 28*28, |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
424 hidden_layers_sizes = [1000,1000,1000], |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|
425 n_outs = 10) |
daa355332b66
added sigmoid_output_SdA.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff
changeset
|


    #########################
    # PRETRAINING THE MODEL #
    #########################
    print '... getting the pretraining functions'
    pretraining_fns = sda.pretraining_functions(train_set_x = train_set_x,
                                                batch_size = batch_size)

    print '... pre-training the model'
    start_time = time.clock()
    # Pre-train layer-wise
    corruption_levels = [.1, .1, .0]
    for i in xrange(sda.n_layers):
        # go through pretraining epochs
        for epoch in xrange(pretraining_epochs):
            # go through the training set
            c = []
            for batch_index in xrange(n_train_batches):
                c.append(pretraining_fns[i](index = batch_index,
                                            corruption = corruption_levels[i],
                                            lr = pretrain_lr))
            print 'Pre-training layer %i, epoch %d, cost ' % (i, epoch), numpy.mean(c)

    end_time = time.clock()

    print >> sys.stderr, ('The pretraining code for file ' +
                          os.path.split(__file__)[1] +
                          ' ran for %.2fm (expected 4.58m in our buildbot)' %
                          ((end_time - start_time) / 60.))
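The greedy layer-wise structure of the loop above can be sketched without Theano; in this toy version (all names and numbers are made up stand-ins), each layer owns a pretraining function that is called once per minibatch, and only one layer is "trained" at a time:

```python
# Toy stand-ins for sda.n_layers, pretraining_epochs, and n_train_batches.
n_layers, pretraining_epochs, n_train_batches = 3, 2, 5
corruption_levels = [.1, .1, .0]

def make_pretrain_fn(layer):
    # Stand-in for pretraining_fns[i]: the cost shrinks with each call,
    # mimicking a decreasing reconstruction cost; a real SdA would update
    # the layer's weights here.
    state = {'cost': 1.0 + layer}
    def fn(index, corruption, lr):
        state['cost'] *= (1.0 - lr)
        return state['cost'] + corruption
    return fn

pretraining_fns = [make_pretrain_fn(i) for i in range(n_layers)]

mean_costs = []
for i in range(n_layers):                     # greedy: one layer at a time
    for epoch in range(pretraining_epochs):
        c = [pretraining_fns[i](index=b,
                                corruption=corruption_levels[i],
                                lr=0.1)
             for b in range(n_train_batches)]
        mean_costs.append(sum(c) / len(c))

print(len(mean_costs))  # one mean cost per (layer, epoch) pair -> 6
```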

    ########################
    # FINETUNING THE MODEL #
    ########################

    # get the training, validation and testing functions for the model
    print '... getting the finetuning functions'
    train_fn, validate_model, test_model = sda.build_finetune_functions(
                datasets = datasets, batch_size = batch_size,
                learning_rate = finetune_lr)

    print '... finetuning the model'
    # early-stopping parameters
    patience = 10 * n_train_batches  # look at this many examples regardless
    patience_increase = 2.           # wait this much longer when a new best
                                     # is found
    improvement_threshold = 0.995    # a relative improvement of this much is
                                     # considered significant
    validation_frequency = min(n_train_batches, patience / 2)
                                     # go through this many
                                     # minibatches before checking the network
                                     # on the validation set; in this case we
                                     # check every epoch
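The patience mechanics configured above can be exercised in isolation. This sketch (with a made-up validation-loss sequence) shows the behaviour: patience grows whenever validation improves by more than `improvement_threshold`, and training stops once the iteration count reaches patience:

```python
# Hypothetical validation losses, one per validation check.
losses = [0.50, 0.40, 0.39, 0.395, 0.41, 0.42, 0.43, 0.44, 0.45, 0.46]

patience = 4                  # toy value; the script uses 10*n_train_batches
patience_increase = 2.
improvement_threshold = 0.995

best_loss = float('inf')
stopped_at = None
for it, loss in enumerate(losses):
    if loss < best_loss:
        # significant improvement extends the patience budget
        if loss < best_loss * improvement_threshold:
            patience = max(patience, it * patience_increase)
        best_loss = loss
    if patience <= it:        # budget exhausted: stop early
        stopped_at = it
        break

print(best_loss, stopped_at)  # -> 0.39 4
```

Note that the best loss seen (0.39, at iteration 2) is kept even though the loop runs on until patience runs out at iteration 4.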

    best_params = None
    best_validation_loss = float('inf')
    test_score = 0.
    start_time = time.clock()

    done_looping = False
    epoch = 0

    while (epoch < training_epochs) and (not done_looping):
        for minibatch_index in xrange(n_train_batches):
            minibatch_avg_cost = train_fn(minibatch_index)
            iter = epoch * n_train_batches + minibatch_index

            if (iter + 1) % validation_frequency == 0:
                validation_losses = validate_model()
                this_validation_loss = numpy.mean(validation_losses)
                print('epoch %i, minibatch %i/%i, validation error %f %%' %
                      (epoch, minibatch_index + 1, n_train_batches,
                       this_validation_loss * 100.))

                # if we got the best validation score until now
                if this_validation_loss < best_validation_loss:

                    # improve patience if loss improvement is good enough
                    if this_validation_loss < best_validation_loss * \
                            improvement_threshold:
                        patience = max(patience, iter * patience_increase)

                    # save best validation score and iteration number
                    best_validation_loss = this_validation_loss
                    best_iter = iter

                    # test it on the test set
                    test_losses = test_model()
                    test_score = numpy.mean(test_losses)
                    print(('     epoch %i, minibatch %i/%i, test error of best '
                           'model %f %%') %
                          (epoch, minibatch_index + 1, n_train_batches,
                           test_score * 100.))

            if patience <= iter:
                done_looping = True
                break
        epoch = epoch + 1

    end_time = time.clock()
    print(('Optimization complete with best validation score of %f %%, '
           'with test performance %f %%') %
          (best_validation_loss * 100., test_score * 100.))
    print >> sys.stderr, ('The training code for file ' +
                          os.path.split(__file__)[1] +
                          ' ran for %.2fm (expected 3.91m in our buildbot)' %
                          ((end_time - start_time) / 60.))
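The finetuning loop above keeps a single global iteration counter, so validation fires every `validation_frequency` minibatches regardless of epoch boundaries. A small sketch (with hypothetical sizes) of which iterations trigger a check:

```python
# Hypothetical sizes; in the script these come from the dataset and patience.
n_train_batches = 5
n_epochs = 2
validation_frequency = min(n_train_batches, 20 // 2)  # here: every epoch

checked = []
for epoch in range(n_epochs):
    for minibatch_index in range(n_train_batches):
        it = epoch * n_train_batches + minibatch_index  # global iteration
        if (it + 1) % validation_frequency == 0:
            checked.append(it)

print(checked)  # validation runs at the last minibatch of each epoch -> [4, 9]
```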


if __name__ == '__main__':
    test_SdA()