comparison learner.py @ 167:4803cb76e26b

Updated documentation
author Joseph Turian <turian@gmail.com>
date Mon, 12 May 2008 18:51:42 -0400
parents ceae4de18981
children fb4837eed1a6
comparison
166:ee11ed427ba8 167:4803cb76e26b
11 algorithms. 11 algorithms.
12 12
13 A L{Learner} can be seen as a learning algorithm, a function that when 13 A L{Learner} can be seen as a learning algorithm, a function that when
14 applied to training data returns a learned function (which is an object that 14 applied to training data returns a learned function (which is an object that
15 can be applied to other data and return some output data). 15 can be applied to other data and return some output data).
16
17 """ 16 """
18 17
19 def __init__(self): 18 def __init__(self):
20 pass 19 pass
21 20
167 raise AbstractFunction() 166 raise AbstractFunction()
168 167
169 168
170 class TLearner(Learner): 169 class TLearner(Learner):
171 """ 170 """
172 TLearner is a virtual class of Learners that attempts to factor out of the definition 171 TLearner is a virtual class of L{Learner}s that attempts to factor
173 of a learner the steps that are common to many implementations of learning algorithms, 172 out of the definition of a learner the steps that are common to
174 so as to leave only 'the equations' to define in particular sub-classes, using Theano. 173 many implementations of learning algorithms, so as to leave only
175 174 'the equations' to define in particular sub-classes, using Theano.
176 In the default implementations of use and update, it is assumed that the 'use' and 'update' methods 175
177 visit examples in the input dataset sequentially. In the 'use' method only one pass through the dataset is done, 176 In the default implementations of use and update, it is assumed
178 whereas the sub-learner may wish to iterate over the examples multiple times. Subclasses where this 177 that the 'use' and 'update' methods visit examples in the input
179 basic model is not appropriate can simply redefine update or use. 178 dataset sequentially. In the 'use' method only one pass through the
180 179 dataset is done, whereas the sub-learner may wish to iterate over
180 the examples multiple times. Subclasses where this basic model is
181 not appropriate can simply redefine update or use.
182
181 Sub-classes must provide the following functions and functionalities: 183 Sub-classes must provide the following functions and functionalities:
182 - attributeNames(): defines all the names of attributes which can be used as fields or 184 - attributeNames(): defines all the names of attributes which can
183 attributes in input/output datasets or in stats collectors. 185 be used as fields or
184 All these attributes are expected to be theano.Result objects 186 attributes in input/output datasets or in
185 (with a .data property and recognized by theano.Function for compilation). 187 stats collectors. All these attributes
186 The sub-class constructor defines the relations between 188 are expected to be theano.Result objects
187 the Theano variables that may be used by 'use' and 'update' 189 (with a .data property and recognized by
188 or by a stats collector. 190 theano.Function for compilation). The sub-class
189 - defaultOutputFields(input_fields): return a list of default dataset output fields when 191 constructor defines the relations between the
192 Theano variables that may be used by 'use'
193 and 'update' or by a stats collector.
194 - defaultOutputFields(input_fields): return a list of default
195 dataset output fields when
190 None are provided by the caller of use. 196 None are provided by the caller of use.
191 The following naming convention is assumed and important. 197 The following naming convention is assumed and important. Attributes
192 Attributes whose names are listed in attributeNames() can be of any type, 198 whose names are listed in attributeNames() can be of any type,
193 but those that can be referenced as input/output dataset fields or as 199 but those that can be referenced as input/output dataset fields or
194 output attributes in 'use' or as input attributes in the stats collector 200 as output attributes in 'use' or as input attributes in the stats
195 should be associated with a Theano Result variable. If the exported attribute 201 collector should be associated with a Theano Result variable. If the
196 name is <name>, the corresponding Result name (an internal attribute of 202 exported attribute name is <name>, the corresponding Result name
197 the TLearner, created in the sub-class constructor) should be _<name>. 203 (an internal attribute of the TLearner, created in the sub-class
198 Typically <name> will be numpy ndarray and _<name> will be the corresponding 204 constructor) should be _<name>. Typically <name> will be numpy
199 Theano Tensor (for symbolic manipulation). 205 ndarray and _<name> will be the corresponding Theano Tensor (for
206 symbolic manipulation).
200 207
201 @todo push into Learner all the boilerplate that can go there without 208 @todo push into Learner all the boilerplate that can go there without
202 depending on Theano 209 depending on Theano
203 """ 210 """
204 211
250 return [self.__getattribute__('_'+name) for name in names] 257 return [self.__getattribute__('_'+name) for name in names]
251 258
252 259
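The naming convention in the TLearner docstring (exported attribute <name>, internal symbolic variable _<name>) can be illustrated with a minimal sketch. It does not use Theano: `ToyLearner`, `names_to_internal`, and the placeholder strings are hypothetical names invented for this illustration, and plain strings stand in for the symbolic variables. Only the underscore-prefix lookup mirrors the `'_'+name` line shown above.

```python
class ToyLearner:
    """Hypothetical stand-in; this is NOT the TLearner from learner.py."""

    def __init__(self):
        # Assumption: plain strings stand in for symbolic Theano variables.
        # Each exported name <name> has an internal counterpart _<name>.
        self._input = "symbolic:input"
        self._output = "symbolic:output"

    def attributeNames(self):
        # Exported attribute names, without the leading underscore.
        return ["input", "output"]

    def names_to_internal(self, names):
        # Same lookup rule as the line above: <name> -> attribute _<name>.
        return [getattr(self, "_" + name) for name in names]


learner = ToyLearner()
resolved = learner.names_to_internal(learner.attributeNames())
```

Requesting a name that has no underscore-prefixed counterpart raises AttributeError here, which is one way a missing internal variable would surface early.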
253 class MinibatchUpdatesTLearner(TLearner): 260 class MinibatchUpdatesTLearner(TLearner):
254 """ 261 """
255 This adds to L{TLearner} a 262 This adds the following functions to a L{TLearner}:
256 - updateStart(), updateEnd(), updateMinibatch(minibatch), isLastEpoch(): 263 - updateStart(), updateEnd(), updateMinibatch(minibatch), isLastEpoch():
257 functions executed at the beginning, the end, in the middle 264 functions executed at the beginning, the end, in the middle (for
258 (for each minibatch) of the update method, and at the end 265 each minibatch) of the update method, and at the end of each
259 of each epoch. This model only 266 epoch. This model only works for 'online' or one-shot learning
260 works for 'online' or one-shot learning that requires 267 that requires going only once through the training data. For more
261 going only once through the training data. For more complicated 268 complicated models, more specialized subclasses of TLearner should
262 models, more specialized subclasses of TLearner should be used 269 be used or a learning-algorithm specific update method should
263 or a learning-algorithm specific update method should be defined. 270 be defined.
264 271
265 - a 'parameters' attribute which is a list of parameters (whose names are 272 - a 'parameters' attribute which is a list of parameters
266 specified by the user's subclass with the parameterAttributes() method) 273 (whose names are specified by the user's subclass with the
267 274 parameterAttributes() method)
275
268 """ 276 """
269 277
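A rough sketch of how the four hooks named in the docstring might fit together follows. The body of the real update method is not shown in this comparison, so the call order below is an assumption, and `ToyMinibatchLearner`, `calls`, and `total` are hypothetical names used only for illustration.

```python
class ToyMinibatchLearner:
    """Hypothetical sketch; NOT the MinibatchUpdatesTLearner from learner.py."""

    def __init__(self):
        self.calls = []   # records the order in which hooks run
        self.total = 0    # toy "parameter": running sum of all values seen

    def updateStart(self):
        self.calls.append("start")

    def updateMinibatch(self, minibatch):
        self.calls.append("minibatch")
        self.total += sum(minibatch)

    def updateEnd(self):
        self.calls.append("end")

    def isLastEpoch(self):
        # One-shot learning: a single pass over the data, as the docstring says.
        return True

    def update(self, minibatches):
        # Assumed ordering: start hook, one call per minibatch,
        # an end-of-epoch check, then the end hook.
        self.updateStart()
        while True:
            for minibatch in minibatches:
                self.updateMinibatch(minibatch)
            if self.isLastEpoch():
                break
        self.updateEnd()
        return self


toy = ToyMinibatchLearner().update([[1, 2], [3, 4]])
```

In this sketch a subclass that needs several passes would override isLastEpoch; the docstring instead recommends a more specialized subclass or a custom update method for such cases.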
270 def __init__(self): 278 def __init__(self):
271 TLearner.__init__(self) 279 TLearner.__init__(self)
272 self.update_minibatch_function = compile.function(self.names2OpResults(self.updateMinibatchOutputAttributes()+ 280 self.update_minibatch_function = compile.function(self.names2OpResults(self.updateMinibatchOutputAttributes()+