annotate code_tutoriel/mlp.py @ 239:42005ec87747

Manually merged Sylvain's changes to use Arnaud's dataset code, with the difference that I don't use the givens. I probably also take a different approach to limiting the dataset size while debugging.
author fsavard
date Mon, 15 Mar 2010 18:30:21 -0400
parents 4bc5eeec6394
children
1 """
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
2 This tutorial introduces the multilayer perceptron using Theano.
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
3
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
4 A multilayer perceptron is a logistic regressor where
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
5 instead of feeding the input to the logistic regression you insert a
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
6 intermidiate layer, called the hidden layer, that has a nonlinear
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
7 activation function (usually tanh or sigmoid) . One can use many such
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
8 hidden layers making the architecture deep. The tutorial will also tackle
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
9 the problem of MNIST digit classification.
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
10
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
11 .. math::
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
12
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
13 f(x) = G( b^{(2)} + W^{(2)}( s( b^{(1)} + W^{(1)} x))),
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
14
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
15 References:
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
16
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
17 - textbooks: "Pattern Recognition and Machine Learning" -
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
18 Christopher M. Bishop, section 5
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
19
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
20 """
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
21 __docformat__ = 'restructedtext en'
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
22
fda5f787baa6 commit initial
Dumitru Erhan <dumitru.erhan@gmail.com>
parents:
diff changeset
23
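
# In the notation above, the nonlinearity ``s`` corresponds to the tanh used
# by the ``HiddenLayer`` class below, and ``G`` to the softmax of the
# ``LogisticRegression`` output layer imported from logistic_sgd.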

import numpy, time, cPickle, gzip

import theano
import theano.tensor as T


from logistic_sgd import LogisticRegression, load_data


class HiddenLayer(object):
    def __init__(self, rng, input, n_in, n_out, activation = T.tanh):
        """
        Typical hidden layer of an MLP: units are fully-connected and have
        a sigmoidal activation function. The weight matrix W is of shape
        (n_in, n_out) and the bias vector b is of shape (n_out,).

        NOTE : The nonlinearity used here is tanh

        Hidden unit activation is given by: tanh(dot(input,W) + b)

        :type rng: numpy.random.RandomState
        :param rng: a random number generator used to initialize weights

        :type input: theano.tensor.dmatrix
        :param input: a symbolic tensor of shape (n_examples, n_in)

        :type n_in: int
        :param n_in: dimensionality of input

        :type n_out: int
        :param n_out: number of hidden units

        :type activation: theano.Op or function
        :param activation: nonlinearity to be applied in the hidden
                           layer
        """
        self.input = input

        # `W` is initialized with `W_values`, which is uniformly sampled
        # from -sqrt(6./(n_in+n_out)) to sqrt(6./(n_in+n_out));
        # the output of uniform is converted using asarray to dtype
        # theano.config.floatX so that the code is runnable on GPU
        W_values = numpy.asarray( rng.uniform( \
              low  = -numpy.sqrt(6./(n_in+n_out)), \
              high =  numpy.sqrt(6./(n_in+n_out)), \
              size = (n_in, n_out)), dtype = theano.config.floatX)
        self.W = theano.shared(value = W_values)
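
        # For example, with n_in = 28*28 = 784 and n_out = 500 (the values
        # used by test_mlp below), the sampling interval is roughly
        # [-sqrt(6/1284), sqrt(6/1284)] ~= [-0.068, 0.068].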

        b_values = numpy.zeros((n_out,), dtype = theano.config.floatX)
        self.b = theano.shared(value = b_values)

        self.output = activation(T.dot(input, self.W) + self.b)
        # parameters of the model
        self.params = [self.W, self.b]
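

# Illustrative sketch (kept as a comment, not executed): a HiddenLayer can be
# built on its own from a symbolic matrix, for instance
#
#     rng = numpy.random.RandomState(1234)
#     x = T.matrix('x')
#     layer = HiddenLayer(rng = rng, input = x, n_in = 784, n_out = 500)
#
# layer.output is then the symbolic expression tanh(dot(x, layer.W) + layer.b),
# which is exactly how the MLP class below uses it.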


class MLP(object):
    """Multi-Layer Perceptron Class

    A multilayer perceptron is a feedforward artificial neural network model
    that has one layer or more of hidden units and nonlinear activations.
    Intermediate layers usually have as activation function tanh or the
    sigmoid function (defined here by a ``SigmoidalLayer`` class) while the
    top layer is a softmax layer (defined here by a ``LogisticRegression``
    class).
    """


    def __init__(self, rng, input, n_in, n_hidden, n_out):
        """Initialize the parameters for the multilayer perceptron

        :type rng: numpy.random.RandomState
        :param rng: a random number generator used to initialize weights

        :type input: theano.tensor.TensorType
        :param input: symbolic variable that describes the input of the
                      architecture (one minibatch)

        :type n_in: int
        :param n_in: number of input units, the dimension of the space in
                     which the datapoints lie

        :type n_hidden: int
        :param n_hidden: number of hidden units

        :type n_out: int
        :param n_out: number of output units, the dimension of the space in
                      which the labels lie

        """

        # Since we are dealing with a one hidden layer MLP, this will
        # translate into a TanhLayer connected to the LogisticRegression
        # layer; this can be replaced by a SigmoidalLayer, or a layer
        # implementing any other nonlinearity
        self.hiddenLayer = HiddenLayer(rng = rng, input = input,
                                       n_in = n_in, n_out = n_hidden,
                                       activation = T.tanh)

        # The logistic regression layer gets as input the hidden units
        # of the hidden layer
        self.logRegressionLayer = LogisticRegression(
                                    input = self.hiddenLayer.output,
                                    n_in  = n_hidden,
                                    n_out = n_out)

        # L1 norm ; one regularization option is to enforce the L1 norm to
        # be small
        self.L1 = abs(self.hiddenLayer.W).sum() \
                + abs(self.logRegressionLayer.W).sum()

        # square of L2 norm ; one regularization option is to enforce the
        # square of the L2 norm to be small
        self.L2_sqr = (self.hiddenLayer.W**2).sum() \
                    + (self.logRegressionLayer.W**2).sum()

        # negative log likelihood of the MLP is given by the negative
        # log likelihood of the output of the model, computed in the
        # logistic regression layer
        self.negative_log_likelihood = self.logRegressionLayer.negative_log_likelihood
        # same holds for the function computing the number of errors
        self.errors = self.logRegressionLayer.errors

        # the parameters of the model are the parameters of the two layers
        # it is made out of
        self.params = self.hiddenLayer.params + self.logRegressionLayer.params


def test_mlp( learning_rate=0.01, L1_reg = 0.00, L2_reg = 0.0001, n_epochs=1000,
              dataset = 'mnist.pkl.gz'):
    """
    Demonstrate stochastic gradient descent optimization for a multilayer
    perceptron.

    This is demonstrated on MNIST.

    :type learning_rate: float
    :param learning_rate: learning rate used (factor for the stochastic
                          gradient)

    :type L1_reg: float
    :param L1_reg: L1-norm's weight when added to the cost (see
                   regularization)

    :type L2_reg: float
    :param L2_reg: L2-norm's weight when added to the cost (see
                   regularization)

    :type n_epochs: int
    :param n_epochs: maximal number of epochs to run the optimizer

    :type dataset: string
    :param dataset: the path of the MNIST dataset file from
                    http://www.iro.umontreal.ca/~lisa/deep/data/mnist/mnist.pkl.gz

    """
    datasets = load_data(dataset)

    train_set_x, train_set_y = datasets[0]
    valid_set_x, valid_set_y = datasets[1]
    test_set_x,  test_set_y  = datasets[2]


    batch_size = 20    # size of the minibatch

    # compute number of minibatches for training, validation and testing
    n_train_batches = train_set_x.value.shape[0] / batch_size
    n_valid_batches = valid_set_x.value.shape[0] / batch_size
    n_test_batches  = test_set_x.value.shape[0]  / batch_size

    ######################
    # BUILD ACTUAL MODEL #
    ######################
    print '... building the model'

    # allocate symbolic variables for the data
    index = T.lscalar()    # index to a [mini]batch
    x     = T.matrix('x')  # the data is presented as rasterized images
    y     = T.ivector('y') # the labels are presented as 1D vector of
                           # [int] labels

    rng = numpy.random.RandomState(1234)

    # construct the MLP class
    classifier = MLP( rng = rng, input = x, n_in = 28*28, n_hidden = 500, n_out = 10)

    # the cost we minimize during training is the negative log likelihood of
    # the model plus the regularization terms (L1 and L2); cost is expressed
    # here symbolically
    cost = classifier.negative_log_likelihood(y) \
         + L1_reg * classifier.L1 \
         + L2_reg * classifier.L2_sqr

    # compiling a Theano function that computes the mistakes that are made
    # by the model on a minibatch
    test_model = theano.function(inputs = [index],
            outputs = classifier.errors(y),
            givens = {
                x: test_set_x[index*batch_size:(index+1)*batch_size],
                y: test_set_y[index*batch_size:(index+1)*batch_size]})

    validate_model = theano.function(inputs = [index],
            outputs = classifier.errors(y),
            givens = {
                x: valid_set_x[index*batch_size:(index+1)*batch_size],
                y: valid_set_y[index*batch_size:(index+1)*batch_size]})
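
    # In both functions above, `givens` substitutes slices of the shared
    # datasets for the symbolic variables x and y, so the compiled functions
    # take only the minibatch `index` as input.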

    # compute the gradient of cost with respect to theta (stored in params)
    # the resulting gradients will be stored in a list gparams
    gparams = []
    for param in classifier.params:
        gparam = T.grad(cost, param)
        gparams.append(gparam)


    # specify how to update the parameters of the model as a dictionary
    updates = {}
    # given two lists of the same length, A = [a1, a2, a3, a4] and
    # B = [b1, b2, b3, b4], zip generates a list C of the same size, where
    # each element is a pair formed from the two lists:
    #    C = [ (a1,b1), (a2,b2), (a3,b3), (a4,b4) ]
    for param, gparam in zip(classifier.params, gparams):
        updates[param] = param - learning_rate*gparam
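
    # This is the plain stochastic gradient descent rule
    # param := param - learning_rate * d(cost)/d(param); with the default
    # learning_rate of 0.01, each parameter moves by one hundredth of its
    # gradient at every minibatch update.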

    # compiling a Theano function `train_model` that returns the cost, but
    # at the same time updates the parameters of the model based on the rules
    # defined in `updates`
    train_model = theano.function( inputs = [index], outputs = cost,
            updates = updates,
            givens = {
                x: train_set_x[index*batch_size:(index+1)*batch_size],
                y: train_set_y[index*batch_size:(index+1)*batch_size]})

    ###############
    # TRAIN MODEL #
    ###############
    print '... training'

    # early-stopping parameters
    patience = 10000       # look at this many examples regardless
    patience_increase = 2  # wait this much longer when a new best is
                           # found
    improvement_threshold = 0.995  # a relative improvement of this much is
                                   # considered significant
    validation_frequency = min(n_train_batches, patience/2)
                                   # go through this many
                                   # minibatches before checking the network
                                   # on the validation set; in this case we
                                   # check every epoch
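
    # For example: with patience = 10000 and patience_increase = 2, a
    # sufficiently good validation score found at iteration 6000 raises
    # patience to max(10000, 6000 * 2) = 12000, so training continues for at
    # least 6000 more minibatch updates before early stopping can trigger.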


    best_params = None
    best_validation_loss = float('inf')
    best_iter = 0
    test_score = 0.
    start_time = time.clock()

    epoch = 0
    done_looping = False

    while (epoch < n_epochs) and (not done_looping):
        epoch = epoch + 1
        for minibatch_index in xrange(n_train_batches):

            minibatch_avg_cost = train_model(minibatch_index)
            # iteration number
            iter = epoch * n_train_batches + minibatch_index

            if (iter+1) % validation_frequency == 0:
                # compute zero-one loss on validation set
                validation_losses = [validate_model(i) for i in xrange(n_valid_batches)]
                this_validation_loss = numpy.mean(validation_losses)

                print('epoch %i, minibatch %i/%i, validation error %f %%' % \
                     (epoch, minibatch_index+1, n_train_batches, \
                      this_validation_loss*100.))

                # if we got the best validation score until now
                if this_validation_loss < best_validation_loss:
                    # improve patience if loss improvement is good enough
                    if this_validation_loss < best_validation_loss * \
                           improvement_threshold:
                        patience = max(patience, iter * patience_increase)

                    best_validation_loss = this_validation_loss
                    best_iter = iter

                    # test it on the test set
                    test_losses = [test_model(i) for i in xrange(n_test_batches)]
                    test_score = numpy.mean(test_losses)

                    print((' epoch %i, minibatch %i/%i, test error of best '
                           'model %f %%') % \
                          (epoch, minibatch_index+1, n_train_batches, test_score*100.))

            if patience <= iter:
                done_looping = True
                break


    end_time = time.clock()
    print(('Optimization complete. Best validation score of %f %% '
           'obtained at iteration %i, with test performance %f %%') %
          (best_validation_loss * 100., best_iter, test_score*100.))
    print ('The code ran for %f minutes' % ((end_time-start_time)/60.))


if __name__ == '__main__':
    test_mlp()