annotate code_tutoriel/mlp.py @ 239:42005ec87747

Manually merged Sylvain's changes to use Arnaud's dataset code, with the difference that I don't use the givens. I probably also take a different approach to limiting the dataset size while debugging.
author fsavard
date Mon, 15 Mar 2010 18:30:21 -0400
parents 4bc5eeec6394
children
1 """
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
2 This tutorial introduces the multilayer perceptron using Theano.
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
3
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
4 A multilayer perceptron is a logistic regressor where
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
5 instead of feeding the input to the logistic regression you insert a
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
6 intermidiate layer, called the hidden layer, that has a nonlinear
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
7 activation function (usually tanh or sigmoid) . One can use many such
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
8 hidden layers making the architecture deep. The tutorial will also tackle
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
9 the problem of MNIST digit classification.
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
10
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
11 .. math::
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
12
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
13 f(x) = G( b^{(2)} + W^{(2)}( s( b^{(1)} + W^{(1)} x))),
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
14
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
15 References:
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
16
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
17 - textbooks: "Pattern Recognition and Machine Learning" -
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
18 Christopher M. Bishop, section 5
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
19
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
20 """
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
21 __docformat__ = 'restructedtext en'
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
22
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
23
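
# In the notation above, the nonlinearity ``s`` corresponds to the tanh used
# by the ``HiddenLayer`` class below, and ``G`` to the softmax of the
# ``LogisticRegression`` output layer imported from logistic_sgd.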

import numpy, time, cPickle, gzip

import theano
import theano.tensor as T


from logistic_sgd import LogisticRegression, load_data


class HiddenLayer(object):
    def __init__(self, rng, input, n_in, n_out, activation = T.tanh):
        """
        Typical hidden layer of an MLP: units are fully-connected and have
        a sigmoidal activation function. The weight matrix W is of shape
        (n_in, n_out) and the bias vector b is of shape (n_out,).

        NOTE : The nonlinearity used here is tanh

        Hidden unit activation is given by: tanh(dot(input,W) + b)

        :type rng: numpy.random.RandomState
        :param rng: a random number generator used to initialize weights

        :type input: theano.tensor.dmatrix
        :param input: a symbolic tensor of shape (n_examples, n_in)

        :type n_in: int
        :param n_in: dimensionality of input

        :type n_out: int
        :param n_out: number of hidden units

        :type activation: theano.Op or function
        :param activation: nonlinearity to be applied in the hidden
                           layer
        """
        self.input = input

        # `W` is initialized with `W_values`, which is uniformly sampled
        # from -sqrt(6./(n_in+n_out)) to sqrt(6./(n_in+n_out));
        # the output of uniform is converted using asarray to dtype
        # theano.config.floatX so that the code is runnable on GPU
        W_values = numpy.asarray( rng.uniform( \
              low  = -numpy.sqrt(6./(n_in+n_out)), \
              high =  numpy.sqrt(6./(n_in+n_out)), \
              size = (n_in, n_out)), dtype = theano.config.floatX)
        self.W = theano.shared(value = W_values)
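
        # For example, with n_in = 28*28 = 784 and n_out = 500 (the values
        # used by test_mlp below), the sampling interval is roughly
        # [-sqrt(6/1284), sqrt(6/1284)] ~= [-0.068, 0.068].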

        b_values = numpy.zeros((n_out,), dtype = theano.config.floatX)
        self.b = theano.shared(value = b_values)

        self.output = activation(T.dot(input, self.W) + self.b)
        # parameters of the model
        self.params = [self.W, self.b]
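

# Illustrative sketch (kept as a comment, not executed): a HiddenLayer can be
# built on its own from a symbolic matrix, for instance
#
#     rng = numpy.random.RandomState(1234)
#     x = T.matrix('x')
#     layer = HiddenLayer(rng = rng, input = x, n_in = 784, n_out = 500)
#
# layer.output is then the symbolic expression tanh(dot(x, layer.W) + layer.b),
# which is exactly how the MLP class below uses it.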


class MLP(object):
    """Multi-Layer Perceptron Class

    A multilayer perceptron is a feedforward artificial neural network model
    that has one layer or more of hidden units and nonlinear activations.
    Intermediate layers usually have as activation function tanh or the
    sigmoid function (defined here by a ``SigmoidalLayer`` class) while the
    top layer is a softmax layer (defined here by a ``LogisticRegression``
    class).
    """


    def __init__(self, rng, input, n_in, n_hidden, n_out):
        """Initialize the parameters for the multilayer perceptron

        :type rng: numpy.random.RandomState
        :param rng: a random number generator used to initialize weights

        :type input: theano.tensor.TensorType
        :param input: symbolic variable that describes the input of the
                      architecture (one minibatch)

        :type n_in: int
        :param n_in: number of input units, the dimension of the space in
                     which the datapoints lie

        :type n_hidden: int
        :param n_hidden: number of hidden units

        :type n_out: int
        :param n_out: number of output units, the dimension of the space in
                      which the labels lie

        """

        # Since we are dealing with a one hidden layer MLP, this will
        # translate into a TanhLayer connected to the LogisticRegression
        # layer; this can be replaced by a SigmoidalLayer, or a layer
        # implementing any other nonlinearity
        self.hiddenLayer = HiddenLayer(rng = rng, input = input,
                                       n_in = n_in, n_out = n_hidden,
                                       activation = T.tanh)

        # The logistic regression layer gets as input the hidden units
        # of the hidden layer
        self.logRegressionLayer = LogisticRegression(
                                    input = self.hiddenLayer.output,
                                    n_in  = n_hidden,
                                    n_out = n_out)

        # L1 norm ; one regularization option is to enforce the L1 norm to
        # be small
        self.L1 = abs(self.hiddenLayer.W).sum() \
                + abs(self.logRegressionLayer.W).sum()

        # square of L2 norm ; one regularization option is to enforce the
        # square of the L2 norm to be small
        self.L2_sqr = (self.hiddenLayer.W**2).sum() \
                    + (self.logRegressionLayer.W**2).sum()

        # negative log likelihood of the MLP is given by the negative
        # log likelihood of the output of the model, computed in the
        # logistic regression layer
        self.negative_log_likelihood = self.logRegressionLayer.negative_log_likelihood
        # same holds for the function computing the number of errors
        self.errors = self.logRegressionLayer.errors

        # the parameters of the model are the parameters of the two layers
        # it is made out of
        self.params = self.hiddenLayer.params + self.logRegressionLayer.params


def test_mlp( learning_rate=0.01, L1_reg = 0.00, L2_reg = 0.0001, n_epochs=1000,
              dataset = 'mnist.pkl.gz'):
    """
    Demonstrate stochastic gradient descent optimization for a multilayer
    perceptron.

    This is demonstrated on MNIST.

    :type learning_rate: float
    :param learning_rate: learning rate used (factor for the stochastic
                          gradient)

    :type L1_reg: float
    :param L1_reg: L1-norm's weight when added to the cost (see
                   regularization)

    :type L2_reg: float
    :param L2_reg: L2-norm's weight when added to the cost (see
                   regularization)

    :type n_epochs: int
    :param n_epochs: maximal number of epochs to run the optimizer

    :type dataset: string
    :param dataset: the path of the MNIST dataset file from
                    http://www.iro.umontreal.ca/~lisa/deep/data/mnist/mnist.pkl.gz

    """
    datasets = load_data(dataset)

    train_set_x, train_set_y = datasets[0]
    valid_set_x, valid_set_y = datasets[1]
    test_set_x,  test_set_y  = datasets[2]


    batch_size = 20    # size of the minibatch

    # compute number of minibatches for training, validation and testing
    n_train_batches = train_set_x.value.shape[0] / batch_size
    n_valid_batches = valid_set_x.value.shape[0] / batch_size
    n_test_batches  = test_set_x.value.shape[0]  / batch_size

    ######################
    # BUILD ACTUAL MODEL #
    ######################
    print '... building the model'

    # allocate symbolic variables for the data
    index = T.lscalar()    # index to a [mini]batch
    x     = T.matrix('x')  # the data is presented as rasterized images
    y     = T.ivector('y') # the labels are presented as 1D vector of
                           # [int] labels

    rng = numpy.random.RandomState(1234)

    # construct the MLP class
    classifier = MLP( rng = rng, input = x, n_in = 28*28, n_hidden = 500, n_out = 10)

    # the cost we minimize during training is the negative log likelihood of
    # the model plus the regularization terms (L1 and L2); cost is expressed
    # here symbolically
    cost = classifier.negative_log_likelihood(y) \
         + L1_reg * classifier.L1 \
         + L2_reg * classifier.L2_sqr

    # compiling a Theano function that computes the mistakes that are made
    # by the model on a minibatch
    test_model = theano.function(inputs = [index],
            outputs = classifier.errors(y),
            givens = {
                x: test_set_x[index*batch_size:(index+1)*batch_size],
                y: test_set_y[index*batch_size:(index+1)*batch_size]})

    validate_model = theano.function(inputs = [index],
            outputs = classifier.errors(y),
            givens = {
                x: valid_set_x[index*batch_size:(index+1)*batch_size],
                y: valid_set_y[index*batch_size:(index+1)*batch_size]})
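
    # In both functions above, `givens` substitutes slices of the shared
    # datasets for the symbolic variables x and y, so the compiled functions
    # take only the minibatch `index` as input.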

    # compute the gradient of cost with respect to theta (stored in params)
    # the resulting gradients will be stored in a list gparams
    gparams = []
    for param in classifier.params:
        gparam = T.grad(cost, param)
        gparams.append(gparam)


    # specify how to update the parameters of the model as a dictionary
    updates = {}
    # given two lists of the same length, A = [a1, a2, a3, a4] and
    # B = [b1, b2, b3, b4], zip generates a list C of the same size, where
    # each element is a pair formed from the two lists:
    #    C = [ (a1,b1), (a2,b2), (a3,b3), (a4,b4) ]
    for param, gparam in zip(classifier.params, gparams):
        updates[param] = param - learning_rate*gparam
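
    # This is the plain stochastic gradient descent rule
    # param := param - learning_rate * d(cost)/d(param); with the default
    # learning_rate of 0.01, each parameter moves by one hundredth of its
    # gradient at every minibatch update.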

    # compiling a Theano function `train_model` that returns the cost, but
    # at the same time updates the parameters of the model based on the rules
    # defined in `updates`
    train_model = theano.function( inputs = [index], outputs = cost,
            updates = updates,
            givens = {
                x: train_set_x[index*batch_size:(index+1)*batch_size],
                y: train_set_y[index*batch_size:(index+1)*batch_size]})

    ###############
    # TRAIN MODEL #
    ###############
    print '... training'

    # early-stopping parameters
    patience = 10000       # look at this many examples regardless
    patience_increase = 2  # wait this much longer when a new best is
                           # found
    improvement_threshold = 0.995  # a relative improvement of this much is
                                   # considered significant
    validation_frequency = min(n_train_batches, patience/2)
                                   # go through this many
                                   # minibatches before checking the network
                                   # on the validation set; in this case we
                                   # check every epoch
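
    # For example: with patience = 10000 and patience_increase = 2, a
    # sufficiently good validation score found at iteration 6000 raises
    # patience to max(10000, 6000 * 2) = 12000, so training continues for at
    # least 6000 more minibatch updates before early stopping can trigger.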


    best_params = None
    best_validation_loss = float('inf')
    best_iter = 0
    test_score = 0.
    start_time = time.clock()

    epoch = 0
    done_looping = False

    while (epoch < n_epochs) and (not done_looping):
        epoch = epoch + 1
        for minibatch_index in xrange(n_train_batches):

            minibatch_avg_cost = train_model(minibatch_index)
            # iteration number
            iter = epoch * n_train_batches + minibatch_index

            if (iter+1) % validation_frequency == 0:
                # compute zero-one loss on validation set
                validation_losses = [validate_model(i) for i in xrange(n_valid_batches)]
                this_validation_loss = numpy.mean(validation_losses)

                print('epoch %i, minibatch %i/%i, validation error %f %%' % \
                     (epoch, minibatch_index+1, n_train_batches, \
                      this_validation_loss*100.))

                # if we got the best validation score until now
                if this_validation_loss < best_validation_loss:
                    # improve patience if loss improvement is good enough
                    if this_validation_loss < best_validation_loss * \
                           improvement_threshold:
                        patience = max(patience, iter * patience_increase)

                    best_validation_loss = this_validation_loss
                    best_iter = iter

                    # test it on the test set
                    test_losses = [test_model(i) for i in xrange(n_test_batches)]
                    test_score = numpy.mean(test_losses)

                    print((' epoch %i, minibatch %i/%i, test error of best '
                           'model %f %%') % \
                          (epoch, minibatch_index+1, n_train_batches, test_score*100.))

            if patience <= iter:
                done_looping = True
                break


    end_time = time.clock()
    print(('Optimization complete. Best validation score of %f %% '
           'obtained at iteration %i, with test performance %f %%') %
          (best_validation_loss * 100., best_iter, test_score*100.))
    print ('The code ran for %f minutes' % ((end_time-start_time)/60.))


if __name__ == '__main__':
    test_mlp()