comparison learner.py @ 262:14b9779622f9

Split LearningAlgorithm into OfflineLearningAlgorithm and OnlineLearningAlgorithm
author Yoshua Bengio <bengioy@iro.umontreal.ca>
date Tue, 03 Jun 2008 21:34:24 -0400
parents bd728c83faff
children 78cc8fe3bbe9
260:792f81d65f82 262:14b9779622f9
1 1
2 2
3 from exceptions import * 3 from exceptions import *
4 from dataset import AttributesHolder 4 from dataset import AttributesHolder
5 5
6 class LearningAlgorithm(object): 6 class OfflineLearningAlgorithm(object):
7 """ 7 """
8 Base class for learning algorithms, provides an interface 8 Base class for offline learning algorithms, provides an interface
9 that allows various algorithms to be applicable to generic learning 9 that allows various algorithms to be applicable to generic learning
10 algorithms. It is only given here to define the expected semantics. 10 algorithms. It is only given here to define the expected semantics.
11 11
12 A L{Learner} can be seen as a learning algorithm, a function that when 12 An offline learning algorithm can be seen as a function that when
13 applied to training data returns a learned function (which is an object that 13 applied to training data returns a learned function (which is an object that
14 can be applied to other data and return some output data). 14 can be applied to other data and return some output data).
15 15
16 There are two main ways of using a learning algorithms, and some learning 16 The offline learning scenario is the standard and most common one
17 algorithms only support one of them. The first is the way of the standard 17 in machine learning: an offline learning algorithm is applied
18 machine learning framework, in which a learning algorithm is applied
19 to a training dataset, 18 to a training dataset,
20 19
21 model = learning_algorithm(training_set) 20 model = learning_algorithm(training_set)
22 21
23 resulting in a fully trained model that can be applied to another dataset: 22 resulting in a fully trained model that can be applied to another dataset
23 in order to perform some desired computation:
24 24
25 output_dataset = model(input_dataset) 25 output_dataset = model(input_dataset)
26 26
27 Note that the application of a dataset has no side-effect on the model. 27 Note that the application of a dataset has no side-effect on the model.
28 In that example, the training set may for example have 'input' and 'target' 28 In that example, the training set may for example have 'input' and 'target'
29 fields while the input dataset may have only 'input' (or both 'input' and 29 fields while the input dataset may have only 'input' (or both 'input' and
30 'target') and the output dataset would contain some default output fields defined 30 'target') and the output dataset would contain some default output fields defined
31 by the learning algorithm (e.g. 'output' and 'error'). 31 by the learning algorithm (e.g. 'output' and 'error'). The user may specify
32 what the output dataset should contain either by setting options in the
33 model, by the presence of particular fields in the input dataset, or with
34 keyword options of the __call__ method of the model (see LearnedModel.__call__).
32 35
33 The second way of using a learning algorithm is in the online or 36 """
34 adaptive framework, where the training data are only revealed in pieces 37
38 def __init__(self): pass
39
40 def __call__(self, training_dataset):
41 """
42 Return a fully trained TrainedModel.
43 """
44 raise AbstractFunction()
45
46 class TrainedModel(AttributesHolder):
47 """
48 TrainedModel is a base class for models returned by instances of an
49 OfflineLearningAlgorithm subclass. It is only given here to define the expected semantics.
50 """
51 def __init__(self):
52 pass
53
54 def __call__(self,input_dataset,output_fieldnames=None,
55 test_stats_collector=None,copy_inputs=False,
56 put_stats_in_output_dataset=True,
57 output_attributes=[]):
58 """
59 A L{TrainedModel} can be used with
60 one or more calls to it. The main argument is an input L{DataSet} (possibly
61 containing a single example) and the result is an output L{DataSet} of the same length.
62 If output_fieldnames is specified, it may be used to indicate which fields should
63 be constructed in the output L{DataSet} (for example ['output','classification_error']).
64 Otherwise, some default output fields are produced (possibly depending on the input
65 fields available in the input_dataset).
66 Optionally, if copy_inputs, the input fields (of the input_dataset) can be made
67 visible in the output L{DataSet} returned by this method.
68 Optionally, attributes of the learner can be copied in the output dataset,
69 and statistics computed by the stats collector also put in the output dataset.
70 Note the distinction between fields (which are example-wise quantities, e.g. 'input')
71 and attributes (which are not, e.g. 'regularization_term').
72 """
73 raise AbstractFunction()
74
75
76 class OnlineLearningAlgorithm(object):
77 """
78 Base class for online learning algorithms, provides an interface
79 that allows various algorithms to be applicable to generic online learning
80 algorithms. It is only given here to define the expected semantics.
81
82 The basic setting is that the training data are only revealed in pieces
35 (maybe one example or a batch of examples at a time): 83 (maybe one example or a batch of examples at a time):
36 84
37 model = learning_algorithm() 85 model = learning_algorithm()
38 86
39 results in a fresh model. The model can be adapted by presenting 87 results in a fresh model. The model can be adapted by presenting
47 95
48 and at any point one can use the model to perform some computation: 96 and at any point one can use the model to perform some computation:
49 97
50 output_dataset = model(input_dataset) 98 output_dataset = model(input_dataset)
51 99
100 The model should be a LearnerModel subclass instance, and LearnerModel
101 is a subclass of LearnedModel.
102
52 """ 103 """
53 104
54 def __init__(self): pass 105 def __init__(self): pass
55 106
56 def __call__(self, training_dataset=None): 107 def __call__(self, training_dataset=None):
57 """ 108 """
58 Return a LearnerModel, either fresh (if training_dataset is None) or fully trained (otherwise). 109 Return a LearnerModel, either fresh (if training_dataset is None) or fully trained (otherwise).
59 """ 110 """
60 raise AbstractFunction() 111 raise AbstractFunction()
61 112
62 class LearnerModel(AttributesHolder): 113 class LearnerModel(LearnedModel):
63 """ 114 """
64 LearnerModel is a base class for models returned by instances of a LearningAlgorithm subclass. 115 LearnerModel is a base class for models returned by instances of a LearningAlgorithm subclass.
65 It is only given here to define the expected semantics. 116 It is only given here to define the expected semantics.
66 """ 117 """
67 def __init__(self): 118 def __init__(self):
68 pass 119 pass
69 120
70 def update(self,training_set,train_stats_collector=None): 121 def update(self,training_set,train_stats_collector=None):
71 """ 122 """
72 Continue training a learner, with the evidence provided by the given training set. 123 Continue training a learner model, with the evidence provided by the given training set.
73 Hence update can be called multiple times. This is the main method used for training in the 124 Hence update can be called multiple times. This is the main method used for training in the
74 on-line setting or the sequential (Bayesian or not) settings. 125 on-line setting or the sequential (Bayesian or not) settings.
75 126
76 This function has as side effect that self(data) will behave differently, 127 This function has as side effect that self(data) will behave differently,
77 according to the adaptation achieved by update(). 128 according to the adaptation achieved by update().
80 some statistics of the outputs computed during training. It is update(d) during 131 some statistics of the outputs computed during training. It is update(d) during
81 training. 132 training.
82 """ 133 """
83 raise AbstractFunction() 134 raise AbstractFunction()
84 135
85 def __call__(self,input_dataset,output_fieldnames=None,
86 test_stats_collector=None,copy_inputs=False,
87 put_stats_in_output_dataset=True,
88 output_attributes=[]):
89 """
90 A trained or partially trained L{Model} can be used with
91 with one or more calls to it. The argument is an input L{DataSet} (possibly
92 containing a single example) and the result is an output L{DataSet} of the same length.
93 If output_fieldnames is specified, it may be use to indicate which fields should
94 be constructed in the output L{DataSet} (for example ['output','classification_error']).
95 Otherwise, some default output fields are produced (possibly depending on the input
96 fields available in the input_dataset).
97 Optionally, if copy_inputs, the input fields (of the input_dataset) can be made
98 visible in the output L{DataSet} returned by this method.
99 Optionally, attributes of the learner can be copied in the output dataset,
100 and statistics computed by the stats collector also put in the output dataset.
101 Note the distinction between fields (which are example-wise quantities, e.g. 'input')
102 and attributes (which are not, e.g. 'regularization_term').
103 """
104 raise AbstractFunction()
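
The split between the two interfaces in this changeset can be illustrated with a minimal sketch. It is not part of the changeset: the class names (TrainedMeanModel, LearnerMeanModel, OfflineMeanLearner, OnlineMeanLearner) are hypothetical, datasets are assumed to be plain lists of dicts rather than L{DataSet} objects, and only the input_dataset argument of __call__ is modelled. The toy model simply predicts the mean of the 'target' values it has seen.

```python
class TrainedMeanModel:
    """Hypothetical stand-in for a TrainedModel: applying it has no side effect."""

    def __init__(self, mean=0.0):
        self.mean = mean

    def __call__(self, input_dataset):
        # One output example per input example, with a default 'output' field.
        return [{"output": self.mean} for _ in input_dataset]


class LearnerMeanModel(TrainedMeanModel):
    """Hypothetical stand-in for a LearnerModel: adds update() for online training."""

    def __init__(self):
        super().__init__(0.0)
        self.total = 0.0
        self.count = 0

    def update(self, training_set):
        # Side effect: later calls to self(data) reflect the new evidence.
        for example in training_set:
            self.total += example["target"]
            self.count += 1
        if self.count:
            self.mean = self.total / self.count


class OfflineMeanLearner:
    """Offline scenario: one call on a training set returns a fully trained model."""

    def __call__(self, training_dataset):
        targets = [example["target"] for example in training_dataset]
        return TrainedMeanModel(sum(targets) / len(targets))


class OnlineMeanLearner:
    """Online scenario: returns a fresh model, or a trained one if data is given."""

    def __call__(self, training_dataset=None):
        model = LearnerMeanModel()
        if training_dataset is not None:
            model.update(training_dataset)
        return model


# Offline usage: model = learning_algorithm(training_set)
offline_model = OfflineMeanLearner()([{"input": 1, "target": 2.0},
                                      {"input": 2, "target": 4.0}])
offline_output = offline_model([{"input": 5}])

# Online usage: model = learning_algorithm(), then repeated update() calls
online_model = OnlineMeanLearner()()
online_model.update([{"input": 1, "target": 2.0}])
online_model.update([{"input": 2, "target": 4.0}])
online_output = online_model([{"input": 5}, {"input": 6}])
```

In this sketch the online model subclasses the offline one, mirroring the new LearnerModel base class in the right-hand column, so both kinds of model are applied to an input dataset in the same way and only the online one exposes update().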