annotate doc/v2_planning/API_learner.txt @ 1253:826d78f0135f

Prototype for "hooks" simpler than full control-flow rewrite.
author Pascal Lamblin <lamblinp@iro.umontreal.ca>
date Fri, 24 Sep 2010 01:46:12 -0400
parents 317049b21b77
.. _v2planning_learner:

Learner API
===========

A list of "task types"
----------------------

Attributes
~~~~~~~~~~

- sequential
- spatial
- structured
- semi-supervised
- missing-values

Supervised (x,y)
~~~~~~~~~~~~~~~~

- classification
- regression
- probabilistic classification
- ranking
- conditional density estimation
- collaborative filtering
- ordinal regression (open question: is this the same as ranking?)

Unsupervised (x)
~~~~~~~~~~~~~~~~

- de-noising
- feature learning (transformation), e.g. PCA, DAA
- density estimation
- inference

Other
~~~~~

- generation (sampling)
- structure learning ???

Notes on metrics & statistics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- some are applied to a single example, others to a batch
- most statistics are computed over the whole dataset

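The distinction above can be illustrated with a minimal sketch. It is not part of the proposed API: the function names and the representation of a batch as a pair of lists (predictions, targets) are assumptions made for this example only. It shows a per-example metric (0-1 loss), the same metric averaged over one batch, and a dataset-level statistic that has to weight batches by their size.

```python
def zero_one_loss(prediction, target):
    """Per-example metric: 1 if the prediction is wrong, else 0."""
    return 0 if prediction == target else 1


def batch_error_rate(predictions, targets):
    """Batch metric: mean 0-1 loss over a single batch."""
    losses = [zero_one_loss(p, t) for p, t in zip(predictions, targets)]
    return sum(losses) / float(len(losses))


def dataset_error_rate(batches):
    """Dataset statistic: error rate over all examples of all batches."""
    n_errors = 0
    n_examples = 0
    for predictions, targets in batches:
        n_errors += sum(zero_one_loss(p, t)
                        for p, t in zip(predictions, targets))
        n_examples += len(targets)
    return n_errors / float(n_examples)


# Hypothetical toy data: two batches of unequal size.
batches = [
    ([1, 0, 1], [1, 1, 1]),  # one error out of three examples
    ([0], [0]),              # no error
]
per_batch = [batch_error_rate(p, t) for p, t in batches]
overall = dataset_error_rate(batches)
```

With these toy batches the per-batch error rates are 1/3 and 0, while the dataset error rate is 1/4. Averaging the two per-batch values would give 1/6, so a dataset statistic cannot in general be obtained by averaging batch results without accounting for batch sizes.
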
The Learner class
-----------------

.. code-block:: python

    class Learner(object):
        '''
        Takes data as inputs, and learns a prediction function (or several).

        A learner is parametrized by hyper-parameters, which can be set from
        the outside by a "client" of the Learner (for instance a
        HyperLearner or a Tester).

        The data can be given all at once as a data set, or incrementally.
        Some learners need to be fully trained in one step, whereas others
        can be trained incrementally.

        The question of statistics collection during training remains open.
        '''
        #def use_dataset(dataset)

        # Return a dictionary mapping hyper-parameter names (keys)
        # to their values.
        def get_hyper_parameters(self):
            ...
        def set_hyper_parameters(self, dictionary):
            ...


        # Ver B
        def eval(self, dataset):
            ...
        def predict(self, dataset):
            ...

        # Trainable
        def train(self, dataset): # train until completion
            ...

        # Incremental
        def use_dataset(self, dataset):
            ...
        def adapt(self, n_steps=1):
            ...
        def has_converged(self):
            ...

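To make the interface more concrete, here is a minimal sketch of a toy learner that follows the method names of the proposal. Everything beyond those method names is an assumption made for illustration: the class ``MeanRegressor``, its hyper-parameters ``learning_rate`` and ``tolerance``, the representation of a dataset as a list of ``(x, y)`` pairs, and the choice of mean squared error as the value returned by ``eval``. The class does not inherit from ``Learner`` so that the example stays self-contained.

```python
class MeanRegressor(object):
    """Toy learner (hypothetical): always predicts one scalar, the mean target."""

    def __init__(self, learning_rate=0.5, tolerance=1e-6):
        self.set_hyper_parameters(
            {'learning_rate': learning_rate, 'tolerance': tolerance})
        self.mean = 0.0
        self._dataset = None
        self._last_change = float('inf')

    def get_hyper_parameters(self):
        return {'learning_rate': self.learning_rate,
                'tolerance': self.tolerance}

    def set_hyper_parameters(self, dictionary):
        for name, value in dictionary.items():
            setattr(self, name, value)

    # Trainable: one call trains until completion.
    def train(self, dataset):
        targets = [y for _, y in dataset]
        self.mean = sum(targets) / float(len(targets))

    # Incremental: attach a dataset, then take small steps.
    def use_dataset(self, dataset):
        self._dataset = dataset
        self._last_change = float('inf')

    def adapt(self, n_steps=1):
        targets = [y for _, y in self._dataset]
        target_mean = sum(targets) / float(len(targets))
        for _ in range(n_steps):
            # Move the estimate a fraction of the way towards the target mean.
            step = self.learning_rate * (target_mean - self.mean)
            self.mean += step
            self._last_change = abs(step)

    def has_converged(self):
        return self._last_change < self.tolerance

    def predict(self, dataset):
        return [self.mean for _ in dataset]

    def eval(self, dataset):
        # Mean squared error, chosen here only for illustration.
        errors = [(self.mean - y) ** 2 for _, y in dataset]
        return sum(errors) / float(len(errors))


data = [(0, 1.0), (1, 2.0), (2, 3.0)]

# One-step training.
batch = MeanRegressor()
batch.train(data)

# Incremental training driven by the client.
incremental = MeanRegressor(learning_rate=0.5)
incremental.use_dataset(data)
while not incremental.has_converged():
    incremental.adapt(n_steps=10)
```

In this sketch the client, not the learner, owns the training loop in the incremental case. That leaves room for a client to collect statistics between calls to ``adapt``, which is one possible answer to the open question mentioned in the docstring, not a decision recorded in this document.
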
Some example cases
------------------

.. code-block:: python

    class HyperLearner(Learner):

        ### def get_hyper_parameter_distribution(name)
        def set_hyper_parameters_distribution(self, dictionary):
            ...


    def bagging(learner_factory):
        # N (the number of learners) is not specified yet.
        for i in range(N):
            learner_i = learner_factory.new()
            # todo: get dataset_i ??
            learner_i.use_dataset(dataset_i)
            learner_i.train()
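
The ``bagging`` outline above leaves open how ``dataset_i`` is obtained. One common choice, shown here only as a sketch, is to draw a bootstrap sample (uniform sampling with replacement) of the full dataset for each learner. The names ``MeanLearner``, ``MeanLearnerFactory``, ``bootstrap_sample`` and ``bagged_predict``, the extra arguments ``dataset``, ``n_learners`` and ``seed``, and the representation of a dataset as a list of ``(x, y)`` pairs are all assumptions, not part of the proposal.

```python
import random


class MeanLearner(object):
    """Toy learner (hypothetical): predicts the mean target of its dataset."""

    def use_dataset(self, dataset):
        self.dataset = dataset

    def train(self):
        targets = [y for _, y in self.dataset]
        self.mean = sum(targets) / float(len(targets))

    def predict(self, dataset):
        return [self.mean for _ in dataset]


class MeanLearnerFactory(object):
    """Hypothetical factory exposing the new() method used by bagging()."""

    def new(self):
        return MeanLearner()


def bootstrap_sample(dataset, rng):
    """Draw len(dataset) examples uniformly, with replacement."""
    return [dataset[rng.randrange(len(dataset))] for _ in dataset]


def bagging(learner_factory, dataset, n_learners, seed=0):
    """Train n_learners learners, each on its own bootstrap sample."""
    rng = random.Random(seed)
    learners = []
    for i in range(n_learners):
        learner_i = learner_factory.new()
        dataset_i = bootstrap_sample(dataset, rng)
        learner_i.use_dataset(dataset_i)
        learner_i.train()
        learners.append(learner_i)
    return learners


def bagged_predict(learners, dataset):
    """Average the predictions of all learners, example by example."""
    all_predictions = [learner.predict(dataset) for learner in learners]
    return [sum(column) / float(len(column))
            for column in zip(*all_predictions)]


data = [(x, float(x)) for x in range(10)]
ensemble = bagging(MeanLearnerFactory(), data, n_learners=5)
predictions = bagged_predict(ensemble, data)
```

Passing the dataset and the number of learners as arguments is a design choice of this sketch. The proposal could equally let the factory or a dataset object provide the resampled datasets.
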