ift6266: code_tutoriel/SdA.py annotate

annotate code_tutoriel/SdA.py @ 618:14ba0120baff

review response changes

author	Yoshua Bengio <bengioy@iro.umontreal.ca>
date	Sun, 09 Jan 2011 14:13:23 -0500
parents	4bc5eeec6394
children

"""
This tutorial introduces stacked denoising auto-encoders (SdA) using Theano.

Denoising autoencoders are the building blocks for SdA.
They are based on auto-encoders such as the ones used in Bengio et al. 2007.
An autoencoder takes an input x and first maps it to a hidden representation
y = f_{\theta}(x) = s(Wx+b), parameterized by \theta={W,b}. The resulting
latent representation y is then mapped back to a "reconstructed" vector
z \in [0,1]^d in input space z = g_{\theta'}(y) = s(W'y + b'). The weight
matrix W' can optionally be constrained such that W' = W^T, in which case
the autoencoder is said to have tied weights. The network is trained to
minimize the reconstruction error (the error between x and z).
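
A minimal NumPy sketch of the encode/decode mapping described above. It
assumes tied weights (W' = W^T) and a row-vector convention (one example per
row, so the code computes xW + b rather than Wx + b). The names `sigmoid` and
`autoencoder_forward` are hypothetical helpers and are not part of this file.

```python
import numpy as np


def sigmoid(a):
    # Logistic function s(a) = 1 / (1 + exp(-a)), applied element-wise.
    return 1.0 / (1.0 + np.exp(-a))


def autoencoder_forward(x, W, b, b_prime):
    # x: (n_examples, d), W: (d, n_hidden).
    # Assumption: tied weights, so the decoder uses W transposed.
    y = sigmoid(x @ W + b)            # hidden representation
    z = sigmoid(y @ W.T + b_prime)    # reconstruction in input space
    return y, z


rng = np.random.RandomState(0)
x = rng.uniform(size=(4, 6))                   # 4 examples, d = 6
W = rng.uniform(-0.1, 0.1, size=(6, 3))        # 3 hidden units
y, z = autoencoder_forward(x, W, np.zeros(3), np.zeros(6))
```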

For the denoising autoencoder, during training, first x is corrupted into
\tilde{x}, where \tilde{x} is a partially destroyed version of x by means
of a stochastic mapping. Afterwards y is computed as before (using
\tilde{x}), y = s(W\tilde{x} + b) and z as s(W'y + b'). The reconstruction
error is now measured between z and the uncorrupted input x, which is
computed as the cross-entropy:
     - \sum_{k=1}^d[ x_k \log z_k + (1-x_k) \log( 1-z_k)]

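A hedged NumPy sketch of the denoising cost described above. The docstring
only says that x is corrupted by a stochastic mapping; masking noise (each
entry zeroed with probability `corruption_level`) is an assumption made here
for illustration, as are the tied weights and the helper names `corrupt` and
`denoising_cost`.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def corrupt(rng, x, corruption_level):
    # Assumption: masking noise. Each entry is kept with probability
    # 1 - corruption_level and set to zero otherwise.
    mask = rng.binomial(n=1, p=1.0 - corruption_level, size=x.shape)
    return x * mask


def denoising_cost(rng, x, W, b, b_prime, corruption_level):
    tilde_x = corrupt(rng, x, corruption_level)
    y = sigmoid(tilde_x @ W + b)        # encode the corrupted input
    z = sigmoid(y @ W.T + b_prime)      # decode (tied weights assumed)
    # Cross-entropy against the uncorrupted x, summed over the d inputs.
    per_example = -np.sum(x * np.log(z) + (1 - x) * np.log(1 - z), axis=1)
    return per_example.mean()           # average over the minibatch


rng = np.random.RandomState(1)
x = rng.uniform(size=(5, 6))
W = rng.uniform(-0.1, 0.1, size=(6, 3))
cost = denoising_cost(rng, x, W, np.zeros(3), np.zeros(6), 0.3)
```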

References :
  - P. Vincent, H. Larochelle, Y. Bengio, P.A. Manzagol: Extracting and
    Composing Robust Features with Denoising Autoencoders, ICML'08, 1096-1103,
    2008
  - Y. Bengio, P. Lamblin, D. Popovici, H. Larochelle: Greedy Layer-Wise
    Training of Deep Networks, Advances in Neural Information Processing
    Systems 19, 2007

"""

import numpy, time, cPickle, gzip

import theano
import theano.tensor as T
from theano.tensor.shared_randomstreams import RandomStreams

from logistic_sgd import LogisticRegression, load_data
from mlp import HiddenLayer
from dA import dA



class SdA(object):
    """Stacked denoising auto-encoder class (SdA)

    A stacked denoising autoencoder model is obtained by stacking several
    dAs. The hidden layer of the dA at layer `i` becomes the input of
    the dA at layer `i+1`. The first layer dA gets as input the input of
    the SdA, and the hidden layer of the last dA represents the output.
    Note that after pretraining, the SdA is dealt with as a normal MLP;
    the dAs are only used to initialize the weights.
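
A rough NumPy sketch of the stacking described above: the hidden activation
of layer `i` is fed as the input of layer `i+1`. The helper `stack_forward`
and the random weights are hypothetical and only illustrate the data flow;
they do not reproduce how this class builds its Theano graph.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def stack_forward(x, layers):
    # layers: list of (W, b) pairs, one per hidden layer.
    # Returns the activation of every hidden layer, bottom to top.
    activations = []
    layer_input = x
    for W, b in layers:
        layer_input = sigmoid(layer_input @ W + b)
        activations.append(layer_input)
    return activations


rng = np.random.RandomState(0)
sizes = [784, 500, 500]   # input size followed by two hidden layers
layers = [(rng.uniform(-0.05, 0.05, size=(n_in, n_out)), np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]
x = rng.uniform(size=(2, 784))
hidden = stack_forward(x, layers)   # hidden[-1] is the output of the stack
```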
    """

    def __init__(self, numpy_rng, theano_rng = None, n_ins = 784,
                 hidden_layers_sizes = [500,500], n_outs = 10,
                 corruption_levels = [0.1, 0.1]):
        """ This class is made to support a variable number of layers.

        :type numpy_rng: numpy.random.RandomState
        :param numpy_rng: numpy random number generator used to draw initial
                          weights

        :type theano_rng: theano.tensor.shared_randomstreams.RandomStreams
        :param theano_rng: Theano random generator; if None is given one is
                           generated based on a seed drawn from `numpy_rng`

        :type n_ins: int
        :param n_ins: dimension of the input to the SdA

        :type hidden_layers_sizes: list of ints
        :param hidden_layers_sizes: sizes of the intermediate layers; must
                                    contain at least one value

        :type n_outs: int
        :param n_outs: dimension of the output of the network

        :type corruption_levels: list of float
        :param corruption_levels: amount of corruption to use for each
                                  layer
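
As an illustration of how these size arguments fit together, the plain-Python
sketch below lists the (n_in, n_out) pair of each layer for the default
arguments, including the logistic layer placed on top of the last hidden
layer. `layer_shapes` is a hypothetical helper, not a method of this class.

```python
def layer_shapes(n_ins, hidden_layers_sizes, n_outs):
    # At least one hidden layer is required, as in the constructor.
    assert len(hidden_layers_sizes) > 0
    shapes = []
    for i, n_hidden in enumerate(hidden_layers_sizes):
        # The first layer reads the raw input; every other layer reads
        # the hidden units of the layer below.
        input_size = n_ins if i == 0 else hidden_layers_sizes[i - 1]
        shapes.append((input_size, n_hidden))
    # The logistic layer maps the top hidden layer to the outputs.
    shapes.append((hidden_layers_sizes[-1], n_outs))
    return shapes


shapes = layer_shapes(784, [500, 500], 10)
# shapes == [(784, 500), (500, 500), (500, 10)]
```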
        """

        self.sigmoid_layers = []
        self.dA_layers = []
        self.params = []
        self.n_layers = len(hidden_layers_sizes)

        assert self.n_layers > 0

        if not theano_rng:
            theano_rng = RandomStreams(numpy_rng.randint(2**30))
        # allocate symbolic variables for the data
        self.x = T.matrix('x')  # the data is presented as rasterized images
        self.y = T.ivector('y') # the labels are presented as 1D vector of
                                # [int] labels

        # The SdA is an MLP in which the weights of each intermediate layer
        # are shared with a different denoising autoencoder.
        # We will first construct the SdA as a deep multilayer perceptron,
        # and when constructing each sigmoidal layer we also construct a
        # denoising autoencoder that shares weights with that layer.
        # During pretraining we will train these autoencoders (which will
        # lead to changing the weights of the MLP as well).
        # During finetuning we will finish training the SdA by doing
        # stochastic gradient descent on the MLP.

        for i in xrange( self.n_layers ):
            # construct the sigmoidal layer

            # the size of the input is either the number of hidden units of
            # the layer below or the input size if we are on the first layer
            if i == 0 :
                input_size = n_ins
            else:
                input_size = hidden_layers_sizes[i-1]

            # the input to this layer is either the activation of the hidden
            # layer below or the input of the SdA if you are on the first
            # layer
            if i == 0 :
                layer_input = self.x
            else:
                layer_input = self.sigmoid_layers[-1].output

            sigmoid_layer = HiddenLayer(rng = numpy_rng,
                                        input = layer_input,
                                        n_in = input_size,
                                        n_out = hidden_layers_sizes[i],
                                        activation = T.nnet.sigmoid)
            # add the layer to our list of layers
            self.sigmoid_layers.append(sigmoid_layer)
            # it's arguably a philosophical question...
            # but we are going to only declare that the parameters of the
            # sigmoid_layers are parameters of the SdA;
            # the visible biases in the dA are parameters of those
            # dA, but not the SdA
            self.params.extend(sigmoid_layer.params)

            # Construct a denoising autoencoder that shares weights with this
            # layer
            dA_layer = dA(numpy_rng = numpy_rng, theano_rng = theano_rng,
                          input = layer_input,
                          n_visible = input_size,
                          n_hidden = hidden_layers_sizes[i],
                          W = sigmoid_layer.W, bhid = sigmoid_layer.b)
            self.dA_layers.append(dA_layer)


        # We now need to add a logistic layer on top of the MLP
        self.logLayer = LogisticRegression(
                         input = self.sigmoid_layers[-1].output,
                         n_in = hidden_layers_sizes[-1], n_out = n_outs)

        self.params.extend(self.logLayer.params)
        # construct a function that implements one step of finetuning

        # compute the cost for second phase of training,
        # defined as the negative log likelihood
        self.finetune_cost = self.logLayer.negative_log_likelihood(self.y)
        # compute the gradients with respect to the model parameters
        # symbolic variable that points to the number of errors made on the
        # minibatch given by self.x and self.y
        self.errors = self.logLayer.errors(self.y)

    def pretraining_functions(self, train_set_x, batch_size):
        ''' Generates a list of functions, each of them implementing one
        step in training the dA corresponding to the layer with same index.
        The function will require as input the minibatch index, and to train
        a dA you just need to iterate, calling the corresponding function on
        all minibatch indexes.

        :type train_set_x: theano.tensor.TensorType
        :param train_set_x: Shared variable that contains all datapoints used
                            for training the dA

        :type batch_size: int
        :param batch_size: size of a [mini]batch

        :type learning_rate: float
        :param learning_rate: learning rate used during training for any of
                              the dA layers (a symbolic scalar created inside
                              this method, not an argument of it)
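
The minibatch indexing this method relies on can be sketched in plain Python.
`minibatch_bounds` is a hypothetical helper; it uses `//` for floor division,
which matches `/` on ints only under the assumption that this file targets
Python 2 (it uses `xrange` and `cPickle`). A trailing partial batch is never
visited.

```python
import numpy as np


def minibatch_bounds(n_examples, batch_size):
    # Number of complete minibatches; leftover examples are dropped.
    n_batches = n_examples // batch_size
    # For each index: (batch_begin, batch_end), with batch_end exclusive.
    return [(index * batch_size, index * batch_size + batch_size)
            for index in range(n_batches)]


data = np.arange(10).reshape(10, 1)           # 10 examples, 1 feature
bounds = minibatch_bounds(data.shape[0], 4)   # [(0, 4), (4, 8)]
first_batch = data[bounds[0][0]:bounds[0][1]]
```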
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	182 '''
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	183
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	184 # index to a [mini]batch
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	185 index = T.lscalar('index') # index to a minibatch
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	186 corruption_level = T.scalar('corruption') # amount of corruption to use
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	187 learning_rate = T.scalar('lr') # learning rate to use
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	188 # number of batches
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	189 n_batches = train_set_x.value.shape[0] / batch_size
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	190 # beginning of a batch, given `index`
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	191 batch_begin = index * batch_size
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	192 # ending of a batch given `index`
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	193 batch_end = batch_begin+batch_size
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	194
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	195 pretrain_fns = []
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	196 for dA in self.dA_layers:
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	197 # get the cost and the updates list
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	198 cost,updates = dA.get_cost_updates( corruption_level, learning_rate)
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	199 # compile the theano function
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	200 fn = theano.function( inputs = [index,
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	201 theano.Param(corruption_level, default = 0.2),
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	202 theano.Param(learning_rate, default = 0.1)],
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	203 outputs = cost,
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	204 updates = updates,
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	205 givens = {self.x :train_set_x[batch_begin:batch_end]})
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	206 # append `fn` to the list of functions
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	207 pretrain_fns.append(fn)
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	208
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	209 return pretrain_fns
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	210
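The minibatch indexing used by the pretraining functions above can be sketched without Theano. This is a minimal illustration in plain Python, not the tutorial's implementation: the helper names `minibatch_bounds` and `iter_minibatches` are hypothetical, and the list stands in for the shared variable `train_set_x`.

```python
def minibatch_bounds(index, batch_size):
    """Return the (begin, end) slice bounds of minibatch number `index`."""
    batch_begin = index * batch_size
    batch_end = batch_begin + batch_size
    return batch_begin, batch_end


def iter_minibatches(data, batch_size):
    """Yield consecutive minibatches of `data`.

    Integer division drops a trailing partial batch, matching the
    `shape[0] / batch_size` computation under Python 2 semantics.
    """
    n_batches = len(data) // batch_size
    for index in range(n_batches):
        begin, end = minibatch_bounds(index, batch_size)
        yield data[begin:end]


# Ten examples with batch_size 4 give two full batches; the last two
# examples are not visited.
batches = list(iter_minibatches(list(range(10)), 4))
```

Here `index` plays the same role as the symbolic `T.lscalar('index')`: only the batch number is passed in, and the slice bounds are derived from it.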
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	211
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	212 def build_finetune_functions(self, datasets, batch_size, learning_rate):
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	213 '''Generates a function `train` that implements one step of
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	214 finetuning, a function `validate` that computes the error on
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	215 every batch of the validation set, and a function `test` that
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	216 computes the error on every batch of the testing set
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	217
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	218 :type datasets: list of pairs of theano.tensor.TensorType
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	219 :param datasets: a list that contains all the datasets;
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	220 it has to contain three pairs, `train`,
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	221 `valid`, `test` in this order, where each pair
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	222 is formed of two Theano variables, one for the
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	223 datapoints, the other for the labels
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	224
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	225 :type batch_size: int
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	226 :param batch_size: size of a minibatch
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	227
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	228 :type learning_rate: float
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	229 :param learning_rate: learning rate used during finetune stage
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	230 '''
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	231
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	232 (train_set_x, train_set_y) = datasets[0]
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	233 (valid_set_x, valid_set_y) = datasets[1]
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	234 (test_set_x , test_set_y ) = datasets[2]
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	235
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	236 # compute number of minibatches for training, validation and testing
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	237 n_valid_batches = valid_set_x.value.shape[0] / batch_size
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	238 n_test_batches = test_set_x.value.shape[0] / batch_size
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	239
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	240 index = T.lscalar('index') # index to a [mini]batch
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	241
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	242 # compute the gradients with respect to the model parameters
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	243 gparams = T.grad(self.finetune_cost, self.params)
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	244
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	245 # compute list of fine-tuning updates
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	246 updates = {}
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	247 for param, gparam in zip(self.params, gparams):
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	248 updates[param] = param - gparam*learning_rate
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	249
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	250 train_fn = theano.function(inputs = [index],
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	251 outputs = self.finetune_cost,
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	252 updates = updates,
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	253 givens = {
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	254 self.x : train_set_x[index*batch_size:(index+1)*batch_size],
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	255 self.y : train_set_y[index*batch_size:(index+1)*batch_size]})
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	256
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	257 test_score_i = theano.function([index], self.errors,
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	258 givens = {
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	259 self.x: test_set_x[index*batch_size:(index+1)*batch_size],
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	260 self.y: test_set_y[index*batch_size:(index+1)*batch_size]})
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	261
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	262 valid_score_i = theano.function([index], self.errors,
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	263 givens = {
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	264 self.x: valid_set_x[index*batch_size:(index+1)*batch_size],
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	265 self.y: valid_set_y[index*batch_size:(index+1)*batch_size]})
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	266
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	267 # Create a function that scans the entire validation set
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	268 def valid_score():
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	269 return [valid_score_i(i) for i in xrange(n_valid_batches)]
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	270
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	271 # Create a function that scans the entire test set
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	272 def test_score():
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	273 return [test_score_i(i) for i in xrange(n_test_batches)]
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	274
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	275 return train_fn, valid_score, test_score
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	276
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	277
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	278
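The fine-tuning update built above, `param - gparam*learning_rate`, is plain gradient descent applied to every parameter. The following sketch shows the same rule on a toy problem in plain Python. It is an illustration under stated assumptions, not the tutorial's code: `sgd_updates` and `quadratic_grads` are hypothetical helpers, and the hand-written gradient stands in for what `T.grad` derives symbolically.

```python
def sgd_updates(params, grads, learning_rate):
    """Return new parameter values following param - learning_rate * grad."""
    return {name: params[name] - learning_rate * grads[name]
            for name in params}


def quadratic_grads(params):
    """Gradient of the toy cost (w - 3)^2 + (b + 1)^2 (an assumed example)."""
    return {"w": 2.0 * (params["w"] - 3.0),
            "b": 2.0 * (params["b"] + 1.0)}


# Repeated application of the update moves the parameters toward the
# minimiser of the toy cost, w = 3 and b = -1.
params = {"w": 0.0, "b": 0.0}
for _ in range(100):
    params = sgd_updates(params, quadratic_grads(params), learning_rate=0.1)
```

In the Theano version the dictionary maps shared variables to symbolic update expressions, and the assignment happens inside the compiled `train_fn` rather than in a Python loop.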
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	279
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	280
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	281
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	282 def test_SdA( finetune_lr = 0.1, pretraining_epochs = 15, \
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	283 pretrain_lr = 0.1, training_epochs = 1000, \
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	284 dataset='mnist.pkl.gz'):
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	285 """
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	286 Demonstrates how to train and test a stacked denoising autoencoder.
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	287
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	288 This is demonstrated on MNIST.
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	289
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	290 :type finetune_lr: float
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	291 :param finetune_lr: learning rate used in the finetune stage
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	292 (factor for the stochastic gradient)
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	293
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	294 :type pretraining_epochs: int
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	295 :param pretraining_epochs: number of epochs to do pretraining
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	296
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	297 :type pretrain_lr: float
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	298 :param pretrain_lr: learning rate to be used during pre-training
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	299
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	300 :type training_epochs: int
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	301 :param training_epochs: maximal number of epochs to run the optimizer
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	302
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	303 :type dataset: string
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	304 :param dataset: path to the pickled dataset
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	305
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	306 """
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	307
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	308 datasets = load_data(dataset)
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	309
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	310 train_set_x, train_set_y = datasets[0]
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	311 valid_set_x, valid_set_y = datasets[1]
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	312 test_set_x , test_set_y = datasets[2]
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	313
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	314
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	315 batch_size = 20 # size of the minibatch
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	316
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	317 # compute number of minibatches for training, validation and testing
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	318 n_train_batches = train_set_x.value.shape[0] / batch_size
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	319
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	320 # numpy random generator
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	321 numpy_rng = numpy.random.RandomState(123)
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	322 print '... building the model'
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	323 # construct the stacked denoising autoencoder class
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	324 sda = SdA( numpy_rng = numpy_rng, n_ins = 28*28,
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	325 hidden_layers_sizes = [1000,1000,1000],
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	326 n_outs = 10)
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	327
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	328
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	329 #########################
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	330 # PRETRAINING THE MODEL #
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	331 #########################
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	332 print '... getting the pretraining functions'
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	333 pretraining_fns = sda.pretraining_functions(
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	334 train_set_x = train_set_x,
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	335 batch_size = batch_size )
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	336
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	337 print '... pre-training the model'
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	338 start_time = time.clock()
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	339 ## Pre-train layer-wise
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	340 for i in xrange(sda.n_layers):
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	341 # go through pretraining epochs
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	342 for epoch in xrange(pretraining_epochs):
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	343 # go through the training set
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	344 c = []
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	345 for batch_index in xrange(n_train_batches):
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	346 c.append( pretraining_fns[i](index = batch_index,
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	347 corruption = 0.2, lr = pretrain_lr ) )
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	348 print 'Pre-training layer %i, epoch %d, cost '%(i,epoch),numpy.mean(c)
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	349
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	350 end_time = time.clock()
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	351
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	352 print ('Pretraining took %f minutes' %((end_time-start_time)/60.))
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	353
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	354 ########################
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	355 # FINETUNING THE MODEL #
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	356 ########################
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	357
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	358 # get the training, validation and testing function for the model
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	359 print '... getting the finetuning functions'
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	360 train_fn, validate_model, test_model = sda.build_finetune_functions (
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	361 datasets = datasets, batch_size = batch_size,
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	362 learning_rate = finetune_lr)
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	363
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	364 print '... finetuning the model'
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	365 # early-stopping parameters
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	366 patience = 10000 # look at this many examples regardless
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	367 patience_increase = 2. # wait this much longer when a new best is
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	368 # found
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	369 improvement_threshold = 0.995 # a relative improvement of this much is
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	370 # considered significant
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	371 validation_frequency = min(n_train_batches, patience/2)
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	372 # go through this many
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	373 # minibatches before checking the network
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	374 # on the validation set; in this case we
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	375 # check every epoch
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	376
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	377
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	378 best_params = None
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	379 best_validation_loss = float('inf')
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	380 test_score = 0.
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	381 start_time = time.clock()
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	382
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	383 done_looping = False
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	384 epoch = 0
4bc5eeec6394 Updating the tutorial code to the latest revisions. Dumitru Erhan <dumitru.erhan@gmail.com> parents: diff changeset	385
    while (epoch < training_epochs) and (not done_looping):
        epoch = epoch + 1
        for minibatch_index in xrange(n_train_batches):

            minibatch_avg_cost = train_fn(minibatch_index)
            # zero-based count of minibatch updates performed so far;
            # epoch starts at 1 here, hence the (epoch - 1)
            iter = (epoch - 1) * n_train_batches + minibatch_index

            if (iter + 1) % validation_frequency == 0:

                validation_losses = validate_model()
                this_validation_loss = numpy.mean(validation_losses)
                print('epoch %i, minibatch %i/%i, validation error %f %%' %
                      (epoch, minibatch_index + 1, n_train_batches,
                       this_validation_loss * 100.))

                # if we got the best validation score until now
                if this_validation_loss < best_validation_loss:

                    # improve patience if loss improvement is good enough
                    if this_validation_loss < (best_validation_loss *
                                               improvement_threshold):
                        patience = max(patience, iter * patience_increase)

                    # save best validation score and iteration number
                    best_validation_loss = this_validation_loss
                    best_iter = iter

                    # test it on the test set
                    test_losses = test_model()
                    test_score = numpy.mean(test_losses)
                    print((' epoch %i, minibatch %i/%i, test error of best '
                           'model %f %%') %
                          (epoch, minibatch_index + 1, n_train_batches,
                           test_score * 100.))

            # stop once the patience budget (in minibatch updates) is used up
            if patience <= iter:
                done_looping = True
                break

    end_time = time.clock()
    print(('Optimization complete with best validation score of %f %%, '
           'with test performance %f %%') %
          (best_validation_loss * 100., test_score * 100.))
    print('The code ran for %f minutes' % ((end_time - start_time) / 60.))


if __name__ == '__main__':
    test_SdA()
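
The patience-based early stopping used in the fine-tuning loop above can be tried in isolation, without Theano or a dataset. The following is only a minimal sketch under stated assumptions: `early_stopping_run` is a hypothetical helper that is not part of SdA.py, the list of validation losses is made-up data standing in for `numpy.mean(validate_model())`, and the default values for `patience`, `patience_increase` and `improvement_threshold` are illustrative choices, not values taken from this file. It counts iterations from zero using `(epoch - 1) * n_train_batches + minibatch_index`.

```python
from __future__ import print_function


def early_stopping_run(val_losses_per_epoch, n_train_batches,
                       patience=10, patience_increase=2.0,
                       improvement_threshold=0.995):
    """Replay the patience loop over precomputed validation losses.

    val_losses_per_epoch[e] is a stand-in (hypothetical data) for the
    validation loss measured during epoch e + 1; no model is trained.
    Returns (best_loss, best_iter, epochs_run).
    """
    # validate at most once per epoch, as in the tutorial loop
    validation_frequency = min(n_train_batches, patience // 2)
    best_loss = float('inf')
    best_iter = 0
    epoch = 0
    done_looping = False

    while epoch < len(val_losses_per_epoch) and not done_looping:
        epoch += 1
        for minibatch_index in range(n_train_batches):
            # zero-based count of minibatch updates performed so far
            it = (epoch - 1) * n_train_batches + minibatch_index

            if (it + 1) % validation_frequency == 0:
                loss = val_losses_per_epoch[epoch - 1]
                if loss < best_loss:
                    # a large enough improvement extends the budget
                    if loss < best_loss * improvement_threshold:
                        patience = max(patience, it * patience_increase)
                    best_loss = loss
                    best_iter = it

            # stop once the budget of minibatch updates is exhausted
            if patience <= it:
                done_looping = True
                break

    return best_loss, best_iter, epoch


# Made-up loss curve: improves for three epochs, then plateaus.
losses = [0.5, 0.4, 0.39, 0.39, 0.39, 0.39, 0.39, 0.39]
print(early_stopping_run(losses, n_train_batches=5))
```

With this loss curve the last significant improvement happens at iteration 14, which raises the patience to 28 updates, so the loop stops during the sixth epoch even though eight epochs of data are available.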
