annotate baseline/mlp/mlp_nist.py @ 346:7bc555cc9aab

Added to set_batches: choice of the main class
author Guillaume Sicard <guitch21@gmail.com>
date Mon, 19 Apr 2010 07:09:44 -0400
parents fca22114bb23
children 22efb4968054
rev   line source
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
1 """
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
2 This tutorial introduces the multilayer perceptron using Theano.
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
3
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
4 A multilayer perceptron is a logistic regressor where
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
5 instead of feeding the input to the logistic regression you insert an
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
6 intermediate layer, called the hidden layer, that has a nonlinear
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
7 activation function (usually tanh or sigmoid). One can use many such
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
8 hidden layers making the architecture deep. The tutorial will also tackle
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
9 the problem of MNIST digit classification.
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
10
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
11 .. math::
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
12
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
13 f(x) = G( b^{(2)} + W^{(2)}( s( b^{(1)} + W^{(1)} x))),
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
14
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
15 References:
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
16
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
17 - textbooks: "Pattern Recognition and Machine Learning" -
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
18 Christopher M. Bishop, section 5
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
19
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
20 TODO: recommended preprocessing, lr ranges, regularization ranges (explain
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
21 to do lr first, then add regularization)
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
22
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
23 """
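The formula in the docstring above can be sketched in plain NumPy, taking s = tanh and G = softmax as the class below does. This is only an illustrative sketch, not part of the annotated file: `mlp_forward`, `softmax`, and the toy sizes and weight ranges are assumptions introduced here. Inputs are stored one example per row, so the products are written `x.dot(W)`.

```python
import numpy as np

def softmax(z):
    # Subtract the row-wise maximum for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mlp_forward(x, W1, b1, W2, b2):
    """Compute G(b2 + W2 s(b1 + W1 x)) with s = tanh and G = softmax."""
    hidden = np.tanh(x.dot(W1) + b1)       # hidden layer activations
    return softmax(hidden.dot(W2) + b2)    # class probabilities per row

# Toy sizes and weight ranges, chosen only for illustration.
rng = np.random.RandomState(0)
n_in, n_hidden, n_out = 4, 3, 2
W1 = rng.uniform(-0.5, 0.5, size=(n_in, n_hidden))
W2 = rng.uniform(-0.5, 0.5, size=(n_hidden, n_out))
b1 = np.zeros(n_hidden)
b2 = np.zeros(n_out)

x = rng.uniform(size=(5, n_in))            # a minibatch of 5 examples
p = mlp_forward(x, W1, b1, W2, b2)
```

Each row of `p` is a probability distribution over the output classes.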
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
24 __docformat__ = 'restructuredtext en'
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
25
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
26 import pdb
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
27 import numpy
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
28 import pylab
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
29 import theano
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
30 import theano.tensor as T
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
31 import time
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
32 import theano.tensor.nnet
143
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
33 import pylearn
304
1e4bf5a5b46d added type 2 adaptive learning configurable learning weight + versionning
xaviermuller
parents: 237
diff changeset
34 import theano,pylearn.version,ift6266
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
35 from pylearn.io import filetensor as ft
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
36 from ift6266 import datasets
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
37
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
38 data_path = '/data/lisa/data/nist/by_class/'
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
39
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
40 class MLP(object):
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
41 """Multi-Layer Perceptron Class
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
42
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
43 A multilayer perceptron is a feedforward artificial neural network model
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
44 that has one layer or more of hidden units and nonlinear activations.
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
45 Intermediate layers usually use tanh or the sigmoid function as their
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
46 activation function, while the top layer is a softmax layer.
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
47 """
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
48
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
49
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
50
143
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
51 def __init__(self, input, n_in, n_hidden, n_out,learning_rate):
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
52 """Initialize the parameters for the multilayer perceptron
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
53
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
54 :param input: symbolic variable that describes the input of the
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
55 architecture (one minibatch)
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
56
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
57 :param n_in: number of input units, the dimension of the space in
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
58 which the datapoints lie
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
59
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
60 :param n_hidden: number of hidden units
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
61
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
62 :param n_out: number of output units, the dimension of the space in
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
63 which the labels lie
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
64
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
65 """
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
66
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
67 # initialize the parameters theta = (W1,b1,W2,b2) ; note that this
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
68 # example contains only one hidden layer, but one can have as many
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
69 # layers as one wishes, making the network deeper. The only
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
70 # problem with making the network deep this way arises during learning:
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
71 # backpropagation may be unable to move the network from the starting
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
72 # point towards a good solution; this is where pre-training helps, giving a good
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
73 # starting point for backpropagation, but more about this in the
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
74 # other tutorials
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
75
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
76 # `W1` is initialized with `W1_values`, which is uniformly sampled
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
77 # between -sqrt(6./(n_in+n_hidden)) and sqrt(6./(n_in+n_hidden));
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
78 # the output of uniform is converted using asarray to dtype
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
79 # theano.config.floatX so that the code is runnable on GPU
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
80 W1_values = numpy.asarray( numpy.random.uniform( \
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
81 low = -numpy.sqrt(6./(n_in+n_hidden)), \
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
82 high = numpy.sqrt(6./(n_in+n_hidden)), \
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
83 size = (n_in, n_hidden)), dtype = theano.config.floatX)
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
84 # `W2` is initialized with `W2_values`, which is uniformly sampled
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
85 # between -sqrt(6./(n_hidden+n_out)) and sqrt(6./(n_hidden+n_out));
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
86 # the output of uniform is converted using asarray to dtype
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
87 # theano.config.floatX so that the code is runnable on GPU
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
88 W2_values = numpy.asarray( numpy.random.uniform(
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
89 low = -numpy.sqrt(6./(n_hidden+n_out)), \
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
90 high= numpy.sqrt(6./(n_hidden+n_out)),\
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
91 size= (n_hidden, n_out)), dtype = theano.config.floatX)
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
92
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
93 self.W1 = theano.shared( value = W1_values )
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
94 self.b1 = theano.shared( value = numpy.zeros((n_hidden,),
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
95 dtype= theano.config.floatX))
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
96 self.W2 = theano.shared( value = W2_values )
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
97 self.b2 = theano.shared( value = numpy.zeros((n_out,),
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
98 dtype= theano.config.floatX))
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
99
143
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
100 #include the learning rate in the classifier so
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
101 #we can modify it on the fly when we want
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
102 lr_value=learning_rate
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
103 self.lr=theano.shared(value=lr_value)
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
104 # symbolic expression computing the values of the hidden layer
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
105 self.hidden = T.tanh(T.dot(input, self.W1)+ self.b1)
143
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
106
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
107
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
108
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
109 # symbolic expression computing the values of the top layer
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
110 self.p_y_given_x= T.nnet.softmax(T.dot(self.hidden, self.W2)+self.b2)
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
111
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
112 # compute prediction as class whose probability is maximal in
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
113 # symbolic form
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
114 self.y_pred = T.argmax( self.p_y_given_x, axis =1)
143
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
115 self.y_pred_num = T.argmax( self.p_y_given_x[:, 0:10], axis =1)
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
116
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
117
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
118
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
119
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
120 # L1 norm ; one regularization option is to enforce L1 norm to
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
121 # be small
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
122 self.L1 = abs(self.W1).sum() + abs(self.W2).sum()
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
123
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
124 # square of L2 norm ; one regularization option is to enforce
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
125 # square of L2 norm to be small
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
126 self.L2_sqr = (self.W1**2).sum() + (self.W2**2).sum()
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
127
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
128
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
129
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
130 def negative_log_likelihood(self, y):
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
131 """Return the mean of the negative log-likelihood of the prediction
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
132 of this model under a given target distribution.
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
133
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
134 .. math::
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
135
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
136 \frac{1}{|\mathcal{D}|}\mathcal{L} (\theta=\{W,b\}, \mathcal{D}) =
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
137 \frac{1}{|\mathcal{D}|}\sum_{i=0}^{|\mathcal{D}|} \log(P(Y=y^{(i)}|x^{(i)}, W,b)) \\
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
138 \ell (\theta=\{W,b\}, \mathcal{D})
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
139
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
140
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
141 :param y: corresponds to a vector that gives for each example the
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
142 correct label
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
143 """
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
144 return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]),y])
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
145
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
146
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
147
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
148
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
149 def errors(self, y):
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
150 """Return a float representing the number of errors in the minibatch
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
151 over the total number of examples of the minibatch
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
152 """
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
153
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
154 # check if y has the same number of dimensions as y_pred
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
155 if y.ndim != self.y_pred.ndim:
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
156 raise TypeError('y should have the same shape as self.y_pred',
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
157 ('y', y.type, 'y_pred', self.y_pred.type))
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
158 # check if y is of the correct datatype
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
159 if y.dtype.startswith('int'):
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
160 # the T.neq operator returns a vector of 0s and 1s, where 1
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
161 # represents a mistake in prediction
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
162 return T.mean(T.neq(self.y_pred, y))
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
163 else:
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
164 raise NotImplementedError()
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
165
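The two methods above, `negative_log_likelihood` and `errors`, can be mirrored on concrete arrays. The following NumPy sketch is for illustration only; the function names `negative_log_likelihood` and `error_rate` and the example probabilities are assumptions made here, not code from the annotated file.

```python
import numpy as np

def negative_log_likelihood(p_y_given_x, y):
    # For each row i, pick the log-probability assigned to the correct
    # class y[i], then return minus the mean over the minibatch.
    rows = np.arange(y.shape[0])
    return -np.mean(np.log(p_y_given_x[rows, y]))

def error_rate(p_y_given_x, y):
    # Predict the most probable class and count the fraction of mistakes.
    y_pred = np.argmax(p_y_given_x, axis=1)
    return np.mean(y_pred != y)

# Made-up class probabilities for a minibatch of 4 examples, 3 classes.
p = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1],
              [0.3, 0.3, 0.4],
              [0.5, 0.25, 0.25]])
y = np.array([0, 1, 0, 2])

nll = negative_log_likelihood(p, y)
err = error_rate(p, y)
```

The first two examples are classified correctly and the last two are not, so `err` is the fraction 2/4.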
338
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
166 def mlp_get_nist_error(model_name='/u/mullerx/ift6266h10_sandbox_db/xvm_final_lr1_p073/8/best_model.npy.npz',
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
167 data_set=0):
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
168
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
169
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
170
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
171 # allocate symbolic variables for the data
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
172 x = T.fmatrix() # the data is presented as rasterized images
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
173 y = T.lvector() # the labels are presented as 1D vector of
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
174 # [long int] labels
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
175
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
176 # load the data set and create an mlp based on the dimensions of the model
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
177 model=numpy.load(model_name)
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
178 W1=model['W1']
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
179 W2=model['W2']
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
180 b1=model['b1']
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
181 b2=model['b2']
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
182 nb_hidden=b1.shape[0]
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
183 input_dim=W1.shape[0]
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
184 nb_targets=b2.shape[0]
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
185 learning_rate=0.1
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
186
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
187
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
188 if data_set==0:
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
189 dataset=datasets.nist_all()
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
190 elif data_set==1:
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
191 dataset=datasets.nist_P07()
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
192
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
193
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
194 classifier = MLP( input=x,\
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
195 n_in=input_dim,\
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
196 n_hidden=nb_hidden,\
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
197 n_out=nb_targets,
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
198 learning_rate=learning_rate)
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
199
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
200
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
201 #overwrite the initial weights with the weights from the model
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
202 classifier.W1.value=W1
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
203 classifier.W2.value=W2
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
204 classifier.b1.value=b1
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
205 classifier.b2.value=b2
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
206
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
207
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
208 cost = classifier.negative_log_likelihood(y) \
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
209 + 0.0 * classifier.L1 \
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
210 + 0.0 * classifier.L2_sqr
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
211
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
212 # compiling a theano function that computes the mistakes that are made by
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
213 # the model on a minibatch
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
214 test_model = theano.function([x,y], classifier.errors(y))
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
215
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
216
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
217
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
218 #get the test error
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
219 #use a batch size of 1 so we can get the sub-class error
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
220 #without messing with matrices (will be upgraded later)
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
221 test_score=0
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
222 temp=0
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
223 for xt,yt in dataset.test(20):
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
224 test_score += test_model(xt,yt)
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
225 temp = temp+1
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
226 test_score /= temp
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
227
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
228
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
229 return test_score*100
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
230
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
231
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
232
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
233
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
234
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
235
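`mlp_get_nist_error` recovers the network dimensions from the shapes of the arrays stored in the saved model. A minimal round trip with `numpy.savez` and `numpy.load` might look like the sketch below. The temporary path, the toy shapes, and the assumption that the model was written with `numpy.savez` under the keys `W1`, `b1`, `W2`, `b2` are illustrative; the annotated function only shows the loading side.

```python
import os
import tempfile
import numpy as np

# Toy parameters; the shapes are chosen only for illustration.
rng = np.random.RandomState(0)
params = {
    'W1': rng.uniform(size=(4, 3)),   # (input_dim, nb_hidden)
    'b1': np.zeros(3),
    'W2': rng.uniform(size=(3, 2)),   # (nb_hidden, nb_targets)
    'b2': np.zeros(2),
}

# Save every array under its own key in a single .npz archive.
path = os.path.join(tempfile.mkdtemp(), 'best_model.npz')
np.savez(path, **params)

# Reload the archive and read the layer sizes back from the shapes.
model = np.load(path)
input_dim = model['W1'].shape[0]
nb_hidden = model['b1'].shape[0]
nb_targets = model['b2'].shape[0]
model.close()
```

Reading the sizes from the stored arrays means the caller does not have to remember which hyperparameters a given model was trained with.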
304
1e4bf5a5b46d added type 2 adaptive learning configurable learning weight + versionning
xaviermuller
parents: 237
diff changeset
236 def mlp_full_nist( verbose = 1,\
145
8ceaaf812891 changed adaptive lr flag from bool to int for jobman issues
XavierMuller
parents: 143
diff changeset
237 adaptive_lr = 0,\
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
238 data_set=0,\
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
239 learning_rate=0.01,\
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
240 L1_reg = 0.00,\
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
241 L2_reg = 0.0001,\
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
242 nb_max_exemples=1000000,\
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
243 batch_size=20,\
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
244 nb_hidden = 30,\
212
e390b0454515 added classic lr time decay and py code to calculate the error based on a saved model
xaviermuller
parents: 169
diff changeset
245 nb_targets = 62,
338
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
246 tau=1e6,\
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
247 lr_t2_factor=0.5,\
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
248 init_model=0,\
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
249 channel=0):
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
250
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
251
338
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
252 if channel!=0:
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
253 channel.save()
143
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
254 configuration = [learning_rate,nb_max_exemples,nb_hidden,adaptive_lr]
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
255
212
e390b0454515 added classic lr time decay and py code to calculate the error based on a saved model
xaviermuller
parents: 169
diff changeset
256 #save initial learning rate if classical adaptive lr is used
e390b0454515 added classic lr time decay and py code to calculate the error based on a saved model
xaviermuller
parents: 169
diff changeset
257 initial_lr=learning_rate
338
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
258 max_div_count=1000
323
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
259
212
e390b0454515 added classic lr time decay and py code to calculate the error based on a saved model
xaviermuller
parents: 169
diff changeset
260
143
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
261 total_validation_error_list = []
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
262 total_train_error_list = []
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
263 learning_rate_list=[]
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
264 best_training_error=float('inf')
323
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
265 divergence_flag_list=[]
143
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
266
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
267 if data_set==0:
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
268 dataset=datasets.nist_all()
323
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
269 elif data_set==1:
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
270 dataset=datasets.nist_P07()
143
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
271
212
e390b0454515 added classic lr time decay and py code to calculate the error based on a saved model
xaviermuller
parents: 169
diff changeset
272
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
273
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
274
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
275 ishape = (32,32) # this is the size of NIST images
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
276
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
277 # allocate symbolic variables for the data
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
278 x = T.fmatrix() # the data is presented as rasterized images
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
279 y = T.lvector() # the labels are presented as 1D vector of
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
280 # [long int] labels
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
281
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
282
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
283 # construct the MLP classifier
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
284 classifier = MLP( input=x,\
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
285 n_in=32*32,\
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
286 n_hidden=nb_hidden,\
143
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
287 n_out=nb_targets,
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
288 learning_rate=learning_rate)
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
289
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
290
338
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
291 # check if we want to initialise the weights with a previously calculated model
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
292 # dimensions (nb_hidden and nb_targets) must be consistent between the old model and the current configuration
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
293 if init_model!=0:
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
294 old_model=numpy.load(init_model)
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
295 classifier.W1.value=old_model['W1']
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
296 classifier.W2.value=old_model['W2']
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
297 classifier.b1.value=old_model['b1']
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
298 classifier.b2.value=old_model['b2']
143
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
299
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
300
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
301 # the cost we minimize during training is the negative log likelihood of
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
302 # the model plus the regularization terms (L1 and L2); cost is expressed
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
303 # here symbolically
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
304 cost = classifier.negative_log_likelihood(y) \
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
305 + L1_reg * classifier.L1 \
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
306 + L2_reg * classifier.L2_sqr
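The cost assembled above adds two weight penalties to the negative log likelihood. The `MLP` class is not shown in this part of the listing, so the following is only a sketch of what `classifier.L1` and `classifier.L2_sqr` plausibly compute: it assumes the L1 term is the sum of absolute weights and the L2 term is the sum of squared weights, with biases left out. The function name `regularized_cost` and the list-of-rows weight layout are hypothetical.

```python
def regularized_cost(nll, weights, l1_reg=0.0, l2_reg=0.0001):
    """Add L1 and squared-L2 penalties to a negative log likelihood value.

    `weights` is a list of weight matrices, each given as a list of rows.
    Leaving the biases out of the penalty is an assumption of this sketch.
    """
    # Sum of absolute values over every weight in every matrix.
    l1 = sum(abs(w) for matrix in weights for row in matrix for w in row)
    # Sum of squares over every weight in every matrix.
    l2_sqr = sum(w * w for matrix in weights for row in matrix for w in row)
    return nll + l1_reg * l1 + l2_reg * l2_sqr


# Toy weights standing in for W1 and W2.
W1 = [[1.0, -2.0]]
W2 = [[3.0]]
cost = regularized_cost(1.0, [W1, W2], l1_reg=0.5, l2_reg=0.1)
```

With the default `L1_reg = 0.00` in `mlp_full_nist`, only the squared-L2 term contributes to the cost.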
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
307
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
308 # compiling a theano function that computes the mistakes that are made by
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
309 # the model on a minibatch
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
310 test_model = theano.function([x,y], classifier.errors(y))
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
311
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
312 # compute the gradient of cost with respect to theta = (W1, b1, W2, b2)
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
313 g_W1 = T.grad(cost, classifier.W1)
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
314 g_b1 = T.grad(cost, classifier.b1)
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
315 g_W2 = T.grad(cost, classifier.W2)
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
316 g_b2 = T.grad(cost, classifier.b2)
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
317
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
318 # specify how to update the parameters of the model as a dictionary
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
319 updates = \
143
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
320 { classifier.W1: classifier.W1 - classifier.lr*g_W1 \
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
321 , classifier.b1: classifier.b1 - classifier.lr*g_b1 \
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
322 , classifier.W2: classifier.W2 - classifier.lr*g_W2 \
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
323 , classifier.b2: classifier.b2 - classifier.lr*g_b2 }
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
324
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
325 # compiling a theano function `train_model` that returns the cost and, at
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
326 # the same time, updates the parameters of the model based on the rules
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
327 # defined in `updates`
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
328 train_model = theano.function([x, y], cost, updates = updates )
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
329
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
330
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
331
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
332
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
333
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
334
143
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
335
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
336
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
337
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
338 #conditions for stopping the adaptation:
323
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
339 #1) we have reached nb_max_exemples (this is rounded up to be a multiple of the train size so we always do at least 1 epoch)
143
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
340 #2) validation error has gone up max_div_count times in a row (probable overfitting)
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
341
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
342 # This means we no longer stop on slow convergence as low learning rates stopped
323
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
343 # too fast; instead we wait for the validation error to go up max_div_count times in a row
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
344 # We save the curve of the validation error so we can always go back to check on it
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
345 # and we save the absolute best model anyway, so we might as well explore
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
346 # a bit when diverging
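The stopping rule described in these comments can be sketched in isolation. The function below is a simplified stand-in, not the training loop itself: it walks a list of validation losses, keeps the best one, and counts checks that fail to improve on it. As in the listing, the counter resets only when a new best loss is found. The name `run_validation_checks` and the small `max_div_count` default are chosen for illustration; the listing sets `max_div_count=1000`.

```python
def run_validation_checks(validation_losses, max_div_count=3):
    """Return (best_loss, best_index, stopped_early) for a loss sequence."""
    best_loss = float('inf')
    best_index = -1
    divergence_count = 0
    for index, loss in enumerate(validation_losses):
        if loss < best_loss:
            # New best model: remember it and reset the divergence counter.
            best_loss = loss
            best_index = index
            divergence_count = 0
        else:
            # No improvement over the best loss seen so far.
            divergence_count += 1
            if divergence_count == max_div_count:
                return best_loss, best_index, True
    return best_loss, best_index, False


stopped = run_validation_checks([0.5, 0.4, 0.45, 0.41, 0.42])
finished = run_validation_checks([0.5, 0.6, 0.4, 0.5])
```

In the first call the loss never beats 0.4 again, so three non-improving checks trigger the stop; in the second the sequence ends before the counter reaches the limit.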
143
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
347
323
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
348 #approximate number of samples in the nist training set
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
349 #this is just to have a validation frequency
323
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
350 #roughly proportional to the original nist training set
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
351 n_minibatches = 650000/batch_size
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
352
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
353
323
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
354 patience =2*nb_max_exemples/batch_size #in units of minibatch
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
355 validation_frequency = n_minibatches/4
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
356
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
357
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
358
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
359
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
360
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
361 best_validation_loss = float('inf')
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
362 best_iter = 0
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
363 test_score = 0.
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
364 start_time = time.clock()
212
e390b0454515 added classic lr time decay and py code to calculate the error based on a saved model
xaviermuller
parents: 169
diff changeset
365 time_n=0 #in units of examples
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
366 minibatch_index=0
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
367 epoch=0
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
368 temp=0
323
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
369 divergence_flag=0
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
370
212
e390b0454515 added classic lr time decay and py code to calculate the error based on a saved model
xaviermuller
parents: 169
diff changeset
371
143
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
372
304
1e4bf5a5b46d added type 2 adaptive learning configurable learning weight + versionning
xaviermuller
parents: 237
diff changeset
373 if verbose == 1:
323
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
374 print 'starting training'
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
375 while(minibatch_index*batch_size<nb_max_exemples):
143
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
376
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
377 for x, y in dataset.train(batch_size):
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
378
323
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
379 #if we are using the classic learning rate decay, adjust it before training on the current mini-batch
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
380 if adaptive_lr==2:
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
381 classifier.lr.value = tau*initial_lr/(tau+time_n)
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
382
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
383
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
384 #train model
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
385 cost_ij = train_model(x,y)
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
386
338
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
387 if (minibatch_index) % validation_frequency == 0:
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
388 #save the current learning rate
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
389 learning_rate_list.append(classifier.lr.value)
323
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
390 divergence_flag_list.append(divergence_flag)
338
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
391
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
392 #save temp results to check during training
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
393 numpy.savez('temp_results.npy',config=configuration,total_validation_error_list=total_validation_error_list,\
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
394 learning_rate_list=learning_rate_list, divergence_flag_list=divergence_flag_list)
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
395
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
396 # compute the validation error
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
397 this_validation_loss = 0.
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
398 temp=0
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
399 for xv,yv in dataset.valid(1):
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
400 # sum up the errors for each minibatch
323
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
401 this_validation_loss += test_model(xv,yv)
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
402 temp=temp+1
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
403 # get the average by dividing by the number of minibatches
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
404 this_validation_loss /= temp
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
405 #save the validation loss
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
406 total_validation_error_list.append(this_validation_loss)
304
1e4bf5a5b46d added type 2 adaptive learning configurable learning weight + versionning
xaviermuller
parents: 237
diff changeset
407 if verbose == 1:
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
408 print(('epoch %i, minibatch %i, learning rate %f current validation error %f ') %
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
409 (epoch, minibatch_index+1,classifier.lr.value,
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
410 this_validation_loss*100.))
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
411
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
412 # if we got the best validation score until now
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
413 if this_validation_loss < best_validation_loss:
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
414 # save best validation score and iteration number
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
415 best_validation_loss = this_validation_loss
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
416 best_iter = minibatch_index
323
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
417 #reset divergence flag
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
418 divergence_flag=0
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
419
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
420 #save the best model. Overwrite the current saved best model so
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
421 #we only keep the best
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
422 numpy.savez('best_model.npy', config=configuration, W1=classifier.W1.value, W2=classifier.W2.value, b1=classifier.b1.value,\
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
423 b2=classifier.b2.value, minibatch_index=minibatch_index)
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
424
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
425 # test it on the test set
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
426 test_score = 0.
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
427 temp =0
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
428 for xt,yt in dataset.test(batch_size):
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
429 test_score += test_model(xt,yt)
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
430 temp = temp+1
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
431 test_score /= temp
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
432 if verbose == 1:
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
433 print(('epoch %i, minibatch %i, test error of best '
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
434 'model %f %%') %
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
435 (epoch, minibatch_index+1,
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
436 test_score*100.))
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
437
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
438 # if the validation error is going up, we are overfitting (or oscillating)
323
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
439 # check if we are allowed to continue and if we will adjust the learning rate
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
440 elif this_validation_loss >= best_validation_loss:
323
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
441
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
442
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
443 # In non-classic learning rate decay, we modify the learning rate only when
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
444 # validation error is going up
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
445 if adaptive_lr==1:
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
446 classifier.lr.value=classifier.lr.value*lr_t2_factor
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
447
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
448
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
449 #cap the patience so we are allowed to diverge max_div_count times
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
450 #if we are going up max_div_count times in a row, we will stop immediately
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
451 divergence_flag = divergence_flag +1
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
452
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
453
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
454 #calculate the test error at this point
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
455 # test it on the test set
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
456 test_score = 0.
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
457 temp=0
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
458 for xt,yt in dataset.test(batch_size):
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
459 test_score += test_model(xt,yt)
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
460 temp=temp+1
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
461 test_score /= temp
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
462 if verbose == 1:
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
463 print ' validation error is going up, possibly stopping soon'
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
464 print((' epoch %i, minibatch %i, test error of best '
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
465 'model %f %%') %
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
466 (epoch, minibatch_index+1,
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
467 test_score*100.))
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
468
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
469
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
470
323
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
471 # check early stop condition
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
472 if divergence_flag==max_div_count:
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
473 minibatch_index=nb_max_exemples
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
474 print 'we have diverged, early stopping kicks in'
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
475 break
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
476
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
477 #check if we have seen enough examples
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
478 #force one epoch at least
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
479 if epoch>0 and minibatch_index*batch_size>nb_max_exemples:
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
480 break
338
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
481
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
482
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
483
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
484
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
485
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
486 time_n= time_n + batch_size
323
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
487 minibatch_index = minibatch_index + 1
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
488
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
489 # we have finished looping through the training set
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
490 epoch = epoch+1
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
491 end_time = time.clock()
304
1e4bf5a5b46d added type 2 adaptive learning configurable learning weight + versionning
xaviermuller
parents: 237
diff changeset
492 if verbose == 1:
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
493 print(('Optimization complete. Best validation score of %f %% '
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
494 'obtained at iteration %i, with test performance %f %%') %
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
495 (best_validation_loss * 100., best_iter, test_score*100.))
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
496 print ('The code ran for %f minutes' % ((end_time-start_time)/60.))
322
743907366476 code clean up in progress
xaviermuller
parents: 304
diff changeset
497 print(minibatch_index)
143
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
498
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
499 #save the model and the weights
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
500 numpy.savez('model.npy', config=configuration, W1=classifier.W1.value,W2=classifier.W2.value, b1=classifier.b1.value,b2=classifier.b2.value)
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
501 numpy.savez('results.npy',config=configuration,total_train_error_list=total_train_error_list,total_validation_error_list=total_validation_error_list,\
323
7a7615f940e8 finished code clean up and testing
xaviermuller
parents: 322
diff changeset
502 learning_rate_list=learning_rate_list, divergence_flag_list=divergence_flag_list)
143
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
503
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
504 return (best_training_error*100.0,best_validation_loss * 100.,test_score*100.,best_iter*batch_size,(end_time-start_time)/60)
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
505
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
506
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
507 if __name__ == '__main__':
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
508 mlp_full_nist()
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
509
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
510 def jobman_mlp_full_nist(state,channel):
143
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
511 (train_error,validation_error,test_error,nb_exemples,time)=mlp_full_nist(learning_rate=state.learning_rate,\
304
1e4bf5a5b46d added type 2 adaptive learning configurable learning weight + versionning
xaviermuller
parents: 237
diff changeset
512 nb_max_exemples=state.nb_max_exemples,\
1e4bf5a5b46d added type 2 adaptive learning configurable learning weight + versionning
xaviermuller
parents: 237
diff changeset
513 nb_hidden=state.nb_hidden,\
1e4bf5a5b46d added type 2 adaptive learning configurable learning weight + versionning
xaviermuller
parents: 237
diff changeset
514 adaptive_lr=state.adaptive_lr,\
1e4bf5a5b46d added type 2 adaptive learning configurable learning weight + versionning
xaviermuller
parents: 237
diff changeset
515 tau=state.tau,\
1e4bf5a5b46d added type 2 adaptive learning configurable learning weight + versionning
xaviermuller
parents: 237
diff changeset
516 verbose = state.verbose,\
324
1763c64030d1 fixed bug in jobman interface
xaviermuller
parents: 323
diff changeset
517 lr_t2_factor=state.lr_t2_factor,
338
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
518 data_set=state.data_set,
fca22114bb23 added async save, restart from old model and independant error calculation based on Arnaud's iterator
xaviermuller
parents: 324
diff changeset
519 channel=channel)
143
f341a4efb44a added adaptive lr, weight file save, traine error and error curves
XavierMuller
parents: 110
diff changeset
520 state.train_error=train_error
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
521 state.validation_error=validation_error
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
522 state.test_error=test_error
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
523 state.nb_exemples=nb_exemples
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
524 state.time=time
304
1e4bf5a5b46d added type 2 adaptive learning configurable learning weight + versionning
xaviermuller
parents: 237
diff changeset
525 pylearn.version.record_versions(state,[theano,ift6266,pylearn])
110
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
526 return channel.COMPLETE
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
527
93b4b84d86cf added simple mlp file
XavierMuller
parents:
diff changeset
528