# HG changeset patch
# User Joseph Turian
# Date 1210222454 14400
# Node ID f6505ec32dc3c57f1212b87997ccae3b8489fc6f
# Parent 57e6492644ec5b90f471411543b18b07bb2908f7
Updated documentation slightly

diff -r 57e6492644ec -r f6505ec32dc3 dataset.py
--- a/dataset.py Wed May 07 21:40:15 2008 -0400
+++ b/dataset.py Thu May 08 00:54:14 2008 -0400
@@ -69,7 +69,7 @@
     but when the dataset is a stream (unbounded length), it is not recommanded to do
     such things because the underlying dataset may refuse to access the different fields in
     an unsynchronized ways. Hence the fields() method is illegal for streams, by default.
-    The result of fields() is a DataSetFields object, which iterates over fields,
+    The result of fields() is a L{DataSetFields} object, which iterates over fields,
     and whose elements are iterable over examples. A DataSetFields object can
     be turned back into a DataSet with its examples() method::
       dataset2 = dataset1.fields().examples()
diff -r 57e6492644ec -r f6505ec32dc3 learner.py
--- a/learner.py Wed May 07 21:40:15 2008 -0400
+++ b/learner.py Thu May 08 00:54:14 2008 -0400
@@ -4,11 +4,12 @@
 from theano import tensor as t

 class Learner(AttributesHolder):
-    """Base class for learning algorithms, provides an interface
+    """
+    Base class for learning algorithms, provides an interface
     that allows various algorithms to be applicable to generic learning
     algorithms.

-    A Learner can be seen as a learning algorithm, a function that when
+    A L{Learner} can be seen as a learning algorithm, a function that when
     applied to training data returns a learned function, an object that
     can be applied to other data and return some output data.
     """
@@ -33,7 +34,7 @@
         The result is a function that can be applied on data, with the same
         semantics of the Learner.use method.

-        The user may optionally provide a training StatsCollector that is used to record
+        The user may optionally provide a training L{StatsCollector} that is used to record
         some statistics of the outputs computed during training. It is update(d) during
         training.
         """
@@ -53,14 +54,14 @@
             put_stats_in_output_dataset=True,
             output_attributes=[]):
         """
-        Once a Learner has been trained by one or more call to 'update', it can
-        be used with one or more calls to 'use'. The argument is an input DataSet (possibly
-        containing a single example) and the result is an output DataSet of the same length.
+        Once a L{Learner} has been trained by one or more call to 'update', it can
+        be used with one or more calls to 'use'. The argument is an input L{DataSet} (possibly
+        containing a single example) and the result is an output L{DataSet} of the same length.
         If output_fieldnames is specified, it may be use to indicate which fields should
-        be constructed in the output DataSet (for example ['output','classification_error']).
+        be constructed in the output L{DataSet} (for example ['output','classification_error']).
         Otherwise, self.defaultOutputFields is called to choose the output fields.
         Optionally, if copy_inputs, the input fields (of the input_dataset) can be made
-        visible in the output DataSet returned by this method.
+        visible in the output L{DataSet} returned by this method.
         Optionally, attributes of the learner can be copied in the output dataset,
         and statistics computed by the stats collector also put in the output dataset.
         Note the distinction between fields (which are example-wise quantities, e.g. 'input')
@@ -258,7 +259,7 @@

 class MinibatchUpdatesTLearner(TLearner):
     """
-    This adds to TLearner a
+    This adds to L{TLearner} a
       - updateStart(), updateEnd(), updateMinibatch(minibatch), isLastEpoch():
           functions executed at the beginning, the end, in the middle
           (for each minibatch) of the update method, and at the end
@@ -285,7 +286,7 @@

     def allocate(self, minibatch):
         """
-        This function is called at the beginning of each updateMinibatch
+        This function is called at the beginning of each L{updateMinibatch}
         and should be used to check that all required attributes have been
         allocated and initialized (usually this function calls forget()
         when it has to do an initialization).
@@ -358,15 +359,12 @@

 class OnlineGradientTLearner(MinibatchUpdatesTLearner):
     """
-    Specialization of MinibatchUpdatesTLearner in which the minibatch updates
+    Specialization of L{MinibatchUpdatesTLearner} in which the minibatch updates
     are obtained by performing an online (minibatch-based) gradient step.

     Sub-classes must define the following:
-
-    self._learning_rate (may be changed by the sub-class between epochs or minibatches)
-
-    self.lossAttribute() = name of the loss field
-
+      - self._learning_rate (may be changed by the sub-class between epochs or minibatches)
+      - self.lossAttribute() = name of the loss field
     """
     def __init__(self,truly_online=False):
         """
diff -r 57e6492644ec -r f6505ec32dc3 linear_regression.py
--- a/linear_regression.py Wed May 07 21:40:15 2008 -0400
+++ b/linear_regression.py Thu May 08 00:54:14 2008 -0400
@@ -1,10 +1,13 @@
+"""
+Implementation of linear regression, with or without L2 regularization.
+This is one of the simplest example of L{learner}, and illustrates
+the use of theano.
+"""

 from learner import *
 from theano import tensor as t
 from theano.scalar import as_scalar

-# this is one of the simplest example of learner, and illustrates
-# the use of theano
 class LinearRegression(MinibatchUpdatesTLearner):
     """
     Implement linear regression, with or without L2 regularization
diff -r 57e6492644ec -r f6505ec32dc3 mlp.py
--- a/mlp.py Wed May 07 21:40:15 2008 -0400
+++ b/mlp.py Thu May 08 00:54:14 2008 -0400
@@ -1,11 +1,14 @@
+"""
+A straightforward classicial feedforward
+one-hidden-layer neural net, with L2 regularization.
+This is one of the simplest example of L{Learner}, and illustrates
+the use of theano.
+"""

 from learner import *
 from theano import tensor as t
 from nnet_ops import *

-# this is one of the simplest example of learner, and illustrates
-# the use of theano
-
 class OneHiddenLayerNNetClassifier(OnlineGradientTLearner):
     """