annotate doc/v2_planning/learner.txt @ 1364:01157763c2d7

Reply to Razvan
author Olivier Delalleau <delallea@iro>
date Fri, 12 Nov 2010 11:36:30 -0500
parents 0e12ea6ba661

Committee: AB, PL, GM, IG, RP, NB, PV
Leader: PL

Discussion of Function Specification for Learner Types
======================================================

In its most abstract form, a learner is an object with the
following semantics:

* A learner has named hyper-parameters that control how it learns (these can be viewed
  as options of the constructor, or might be set directly by a user).

* A learner also has an internal state that depends on what it has learned.

* A learner reads and produces data, so the definition of learner is
  intimately linked to the definition of dataset (and task).

* A learner has one or more 'train' or 'adapt' functions by which
  it is given a sample of data (typically either the whole training set, or
  a mini-batch, which contains as a special case a single 'example'). Learners
  interface with datasets in order to obtain data. These functions cause the
  learner to change its internal state and take advantage to some extent
  of the data provided. The 'train' function should take charge of
  completely exploiting the dataset, as specified by the hyper-parameters,
  so that it would typically be called only once. An 'adapt' function
  is meant for learners that can operate in an 'online' setting where
  data continually arrive and the control loop (when to stop) is to
  be managed outside of it. For most intents and purposes, the
  'train' function could also handle the 'online' case by providing
  the controlled iterations over the dataset (which would then be
  seen as a stream of examples).

  * learner.train(dataset)
  * learner.adapt(data)

* Different types of learners can then exploit their internal state
  in order to perform various computations after training is completed,
  or in the middle of training, e.g.,

  * y=learner.predict(x)
    for learners that see (x,y) pairs during training and predict y given x,
    or for learners that see only x's and learn a transformation of them (i.e. feature extraction).
    Here and below, x and y are tensor-like objects whose first index iterates
    over particular examples in a batch or minibatch of examples.

  * p=learner.probability(examples)
    p=learner.log_probability(examples)
    for learners that can estimate probability density or probability functions.
    Note that an example could be a pair (x,y) for learners that expect each example
    to represent such a pair. The second form is provided in case the example
    is high-dimensional and computations in the log-domain are numerically preferable.
    The first dimension of examples or of x and y is an index over a minibatch or a dataset.

  * p=learner.free_energy(x)
    for learners that can estimate a log unnormalized probability; the output has the same length as the input.

  * c=learner.costs(examples)
    returns a matrix of costs (one row per example, i.e., again the output has the same length
    as the input), the first column of which represents the cost whose expectation
    we wish to minimize over new samples from the unknown underlying data distribution.

  Some learners may be able to handle x's and y's that contain missing values.

* For convenience, some of these operations could be bundled, e.g.

  * [prediction,costs] = learner.predict_and_adapt((x,y))

* Some learners could include in their internal state not only what they
  have learned but some information about recently seen examples that conditions
  the expected distribution of upcoming examples. In that case, they might
  be used, e.g. in an online setting, as follows:

  .. code-block:: python

      for (x, y) in data_stream:
          # The learner may rely on state carried over from recent examples.
          prediction, costs = learner.predict((x, y))
          accumulate_statistics(prediction, costs)

* In some cases, each example is itself a (possibly variable-size) sequence
  or other variable-size object (e.g. an image, or a video).
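
To make the semantics listed above concrete, here is a minimal sketch of a toy
learner. It is only an illustration under assumptions, not a proposed API: the
class name ``RunningMeanLearner`` and the ``prior_mean`` hyper-parameter are made
up, and plain Python lists stand in for the tensor-like objects.

```python
class RunningMeanLearner:
    """Hypothetical toy learner: predicts the mean of the targets seen so far."""

    def __init__(self, prior_mean=0.0):
        # Hyper-parameter, given as an option of the constructor.
        self.prior_mean = prior_mean
        # Internal state, which depends on what has been learned.
        self.total = 0.0
        self.count = 0

    def adapt(self, data):
        """Update the state from one minibatch of (x, y) pairs (online use)."""
        for _x, y in data:
            self.total += y
            self.count += 1

    def train(self, dataset):
        """Exploit the whole dataset; typically called only once."""
        self.adapt(dataset)

    def predict(self, x):
        """Return one prediction per example of the minibatch x."""
        mean = self.total / self.count if self.count else self.prior_mean
        return [mean for _ in x]

    def costs(self, examples):
        """Return one row per example; the first column is the squared error."""
        predictions = self.predict([x for x, _y in examples])
        return [[(p - y) ** 2] for p, (_x, y) in zip(predictions, examples)]


learner = RunningMeanLearner()
learner.train([(0.0, 1.0), (1.0, 3.0)])
predictions = learner.predict([5.0, 6.0])          # one prediction per input
cost_rows = learner.costs([(0.0, 2.0), (1.0, 4.0)])  # one row per example
```

The point of the sketch is only that the output of ``predict`` and ``costs`` has
the same length as the input minibatch, and that ``train`` can be expressed in
terms of ``adapt``.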


James's idea for Learner Interface
===================================

Theory:
-------

Think about the unfolding of a learning algorithm as exploring a path in a vast
directed graph.

There are some source nodes, which are potential initial conditions for the
learning algorithm.

At any node, there are a number of outgoing labeled edges that represent
distinct directions of exploration: like "allocate a model with N hidden units",
or "set the l1 weight decay on such-and-such units to 0.1" or "adapt for T
iterations" or "refresh the GPU dataset memory with the next batch of data".

Not all nodes have the same outgoing edge labels. The dataset, model, and
optimization algorithm implementations may each have their various
hyper-parameters with various restrictions on what values they can take, and
when they can be changed.

Every move in this graph incurs some storage and computational expense, and
explores the graph.

Learners typically engage in goal-directed exploration of this graph - for
example to find the node with the best validation-set performance given a
certain computational budget. We might often be interested in the best node
found.

The predict(), log_probability(), free_energy() etc. correspond to costs that we
can measure at any particular node (at some computational expense) to see how we
are doing in our exploration.

Many semantically distinct components come into the definition of this graph:
the model (e.g. DAA), the dataset (e.g. an online one), and the inference and
learning strategy. I'm not sure what to call this graph other than an 'experiment
graph'... so I'll go with that for now.
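
One possible way to picture such a graph in code is sketched below. Everything
here is an assumption made for illustration: the ``Node`` class, the edge labels
``allocate``, ``set_l1`` and ``adapt``, and the idea that a node is just a
configuration dictionary plus the path that led to it.

```python
class Node:
    """Hypothetical experiment-graph node: a configuration plus its path."""

    def __init__(self, config, path=()):
        self.config = dict(config)
        self.path = tuple(path)

    def outgoing_edges(self):
        """Edge labels available here; they differ from node to node."""
        if "n_hidden" not in self.config:
            return ["allocate"]
        return ["set_l1", "adapt"]

    def follow(self, label, value):
        """Return the node reached along a labeled edge; self is unchanged."""
        if label not in self.outgoing_edges():
            raise ValueError("edge %r is not available at this node" % label)
        config = dict(self.config)
        if label == "allocate":
            config["n_hidden"] = value
        elif label == "set_l1":
            config["l1"] = value
        else:  # "adapt": accumulate the number of iterations spent so far
            config["iterations"] = config.get("iterations", 0) + value
        return Node(config, self.path + ((label, value),))


source = Node({})
node = source.follow("allocate", 100).follow("set_l1", 0.1).follow("adapt", 50)
```

Because ``follow`` returns a new node, earlier nodes stay available, so an
exploration policy could come back to them later.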


Use Cases
----------

Early stopping
~~~~~~~~~~~~~~

Early stopping can be implemented as a learner that progresses along a
particular kind of edge (e.g. "train more") until a stopping criterion (in terms
of a cost computed from nodes along the path) is met.
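
A rough sketch of that idea, assuming a patience-based stopping criterion (the
text above does not prescribe one). The function name ``early_stop`` and the
callables ``train_more`` and ``validation_cost`` are hypothetical.

```python
def early_stop(node, train_more, validation_cost, patience=3, max_steps=1000):
    """Follow "train more" edges until the validation cost stops improving.

    Returns the best node seen along the path, and its cost.
    """
    best_node, best_cost = node, validation_cost(node)
    steps_since_best = 0
    for _ in range(max_steps):
        if steps_since_best >= patience:
            break
        node = train_more(node)          # move along one "train more" edge
        cost = validation_cost(node)     # cost measured at the new node
        if cost < best_cost:
            best_node, best_cost = node, cost
            steps_since_best = 0
        else:
            steps_since_best += 1
    return best_node, best_cost


# Toy usage: a node is just an iteration count, and the validation cost
# is lowest after 5 iterations.
best, cost = early_stop(0, lambda n: n + 1, lambda n: (n - 5) ** 2)
```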


Grid Search
~~~~~~~~~~~

Grid search is a learner policy that can be implemented in an experiment graph
where all paths have the form:

    ( "set param 0 to X",
      "set param 1 to Y",
      ... ,
      "set param N to Z",
      adapt,
      [early stop...],
      test)

It would explore all paths of this form and then return the best node.
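
As an illustration only, the policy could be sketched as follows. The function
``grid_search`` and its ``adapt`` and ``test`` callables are hypothetical
stand-ins for following the corresponding edges of the graph.

```python
import itertools


def grid_search(grid, adapt, test):
    """Try every combination of hyper-parameter values; return the best one.

    grid  -- dict mapping a parameter name to the list of values to try
    adapt -- hypothetical callable: params dict -> trained model
    test  -- hypothetical callable: model -> cost (lower is better)
    """
    names = sorted(grid)
    best = None
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))   # the "set param i to ..." edges
        model = adapt(params)               # the "adapt" (and early stop) edges
        cost = test(model)                  # the "test" edge
        if best is None or cost < best[0]:
            best = (cost, params)
    return best


# Toy usage: the "model" is the params dict itself, and the cost is
# minimal at lr=0.1, n_hidden=20.
best_cost, best_params = grid_search(
    {"lr": [0.01, 0.1, 1.0], "n_hidden": [10, 20]},
    adapt=lambda params: params,
    test=lambda model: abs(model["lr"] - 0.1) + abs(model["n_hidden"] - 20),
)
```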


Stagewise learning of DBNs combined with early stopping and grid search
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This would be a learner that is effective for experiment graphs that reflect the
greedy-stagewise optimization of DBNs.


Boosting
~~~~~~~~

Given an ExperimentGraph that permits re-weighting of examples, it is
straightforward to write a meta-ExperimentGraph around it that implements AdaBoost.
A meta-meta-ExperimentGraph around that, one that does early stopping, would complete
the picture and make a useful boosting implementation.
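
The re-weighting loop itself can be sketched independently of any graph
machinery. The code below is plain discrete AdaBoost for labels in {-1, +1};
``fit_weighted`` stands for whatever base learner accepts example weights, and
``fit_stump`` is a toy base learner invented for the usage example.

```python
import math


def adaboost(examples, fit_weighted, n_rounds):
    """Discrete AdaBoost around a base learner that accepts example weights.

    examples     -- list of (x, y) pairs with y in {-1, +1}
    fit_weighted -- hypothetical callable: (examples, weights) -> classifier h,
                    where h(x) returns -1 or +1
    Returns a list of (alpha, h) pairs.
    """
    n = len(examples)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        h = fit_weighted(examples, weights)
        error = sum(w for w, (x, y) in zip(weights, examples) if h(x) != y)
        if error >= 0.5:
            break                      # no better than chance: stop
        error = max(error, 1e-10)      # avoid dividing by zero below
        alpha = 0.5 * math.log((1.0 - error) / error)
        ensemble.append((alpha, h))
        # Re-weight: misclassified examples gain weight, the others lose it.
        weights = [w * math.exp(-alpha * y * h(x))
                   for w, (x, y) in zip(weights, examples)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def boosted_predict(ensemble, x):
    """Sign of the alpha-weighted vote of the ensemble."""
    score = sum(alpha * h(x) for alpha, h in ensemble)
    return 1 if score >= 0 else -1


def fit_stump(examples, weights):
    """Toy base learner: best threshold/sign stump on scalar inputs."""
    best = None
    for threshold, _y in examples:
        for sign in (1, -1):
            h = lambda x, t=threshold, s=sign: s if x >= t else -s
            err = sum(w for w, (x, y) in zip(weights, examples) if h(x) != y)
            if best is None or err < best[0]:
                best = (err, h)
    return best[1]


data = [(1, -1), (2, -1), (3, 1), (4, 1)]
ensemble = adaboost(data, fit_stump, n_rounds=3)
```

An early-stopping wrapper around this would then decide how many rounds to keep,
based on a validation cost.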


Using External Hyper-Parameter Optimization Software
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TODO: use-case - show how we could use the optimizer from
http://www.cs.ubc.ca/labs/beta/Projects/ParamILS/
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
184
1026
38f799f8b6cd v2_planning - thoughts on learner
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1002
diff changeset
185
38f799f8b6cd v2_planning - thoughts on learner
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1002
diff changeset
186 Implementation Details / API
38f799f8b6cd v2_planning - thoughts on learner
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1002
diff changeset
187 ----------------------------
38f799f8b6cd v2_planning - thoughts on learner
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1002
diff changeset
188
1043
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
189 Learner
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
190 ~~~~~~~
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
191 An object that allows us to explore the graph discussed above. Specifically, it represents
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
192 an explored node in that graph.
1026
38f799f8b6cd v2_planning - thoughts on learner
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1002
diff changeset
193
1189
0e12ea6ba661 fix many rst syntax error warning.
Frederic Bastien <nouiz@nouiz.org>
parents: 1167
diff changeset
194 .. code-block:: python
0e12ea6ba661 fix many rst syntax error warning.
Frederic Bastien <nouiz@nouiz.org>
parents: 1167
diff changeset
195
1043
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
196 def active_instructions()
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
197 """ Return a list/set of Instruction instances (see below) that the Learner is prepared
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
198 to handle.
1026
38f799f8b6cd v2_planning - thoughts on learner
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1002
diff changeset
199 """
38f799f8b6cd v2_planning - thoughts on learner
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1002
diff changeset
200
1043
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
201 def copy(), deepcopy()
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
202 """ Learners should be serializable """
1026
38f799f8b6cd v2_planning - thoughts on learner
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1002
diff changeset
203
38f799f8b6cd v2_planning - thoughts on learner
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1002
diff changeset
204
1043
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
205 To make the implementation easier, I found it was helpful to introduce a string-valued
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
206 `fsa_state` member attribute and associate methods to these states. That made it
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
207 syntactically easy to build relatively complex finite-state transition graphs to describe
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
208 which instructions were active at which times in the life-cycle of a learner.
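A minimal sketch of the `fsa_state` idea, under stated assumptions: the class name `ToyLearner`, the state names, the instruction names, and the `ACTIVE_BY_STATE` table are all hypothetical, and `active_instructions()` returns plain names here rather than Instruction instances to keep the example short.

```python
import copy


class ToyLearner(object):
    """Hypothetical learner whose active instructions depend on fsa_state."""

    # Assumed transition table: fsa_state -> names of the active instructions.
    ACTIVE_BY_STATE = {
        'untrained': ('set_lr', 'train'),
        'trained': ('train', 'predict'),
    }

    def __init__(self):
        self.fsa_state = 'untrained'
        self.lr = 0.1
        self.n_updates = 0

    def active_instructions(self):
        # Simplification: return names instead of Instruction objects.
        return sorted(self.ACTIVE_BY_STATE[self.fsa_state])

    def set_lr(self, lr):
        self.lr = lr

    def train(self):
        self.n_updates += 1
        self.fsa_state = 'trained'  # crossing this edge changes the state

    def predict(self):
        return self.lr * self.n_updates


learner = ToyLearner()
before = copy.deepcopy(learner)  # keep the old node before moving
learner.train()
```

After `train()`, `learner` reports a different set of active instructions, while `before` still describes the node it was copied from.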
1026
38f799f8b6cd v2_planning - thoughts on learner
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1002
diff changeset
209
38f799f8b6cd v2_planning - thoughts on learner
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1002
diff changeset
210
1043
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
211 Instruction
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
212 ~~~~~~~~~~~
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
213 An object that represents a potential edge in the graph discussed above. It is an
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
214 operation that a learner can perform.
1026
38f799f8b6cd v2_planning - thoughts on learner
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1002
diff changeset
215
1189
0e12ea6ba661 fix many rst syntax error warning.
Frederic Bastien <nouiz@nouiz.org>
parents: 1167
diff changeset
216 .. code-block:: python
0e12ea6ba661 fix many rst syntax error warning.
Frederic Bastien <nouiz@nouiz.org>
parents: 1167
diff changeset
217
1043
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
218 arg_types
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
219 """a list of Type objects (see below) indicating what args are required by execute"""
1026
38f799f8b6cd v2_planning - thoughts on learner
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1002
diff changeset
220
1043
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
221 def execute(learner, args, kwargs):
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
222 """ Perform some operation on the learner (follow an edge in the graph discussed above)
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
223 and modify the learner in-place. Calling execute 'moves' the learner from one node in
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
224 the graph along an edge. To have the old learner as well, it must be copied prior to
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
225 calling execute().
1026
38f799f8b6cd v2_planning - thoughts on learner
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1002
diff changeset
226 """
38f799f8b6cd v2_planning - thoughts on learner
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1002
diff changeset
227
1043
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
228 def expense(learner, args, kwargs, resource_type='CPUtime'):
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
229 """ Return an estimated cost of performing this instruction (calling execute), in time,
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
230 space, number of computers, disk requirement, etc.
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
231 """
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
232
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
233 Type
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
234 ~~~~
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
235 An object that describes a parameter domain for a call to Instruction.execute.
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
236 It is not necessary that a Type specifies exactly which arguments are legal, but it should
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
237 `include` all legal arguments, and exclude as many illegal ones as possible.
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
238
1189
0e12ea6ba661 fix many rst syntax error warning.
Frederic Bastien <nouiz@nouiz.org>
parents: 1167
diff changeset
239 .. code-block:: python
0e12ea6ba661 fix many rst syntax error warning.
Frederic Bastien <nouiz@nouiz.org>
parents: 1167
diff changeset
240
1043
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
241 def includes(value):
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
242 """return True if value is a legal argument"""
1026
38f799f8b6cd v2_planning - thoughts on learner
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1002
diff changeset
243
38f799f8b6cd v2_planning - thoughts on learner
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1002
diff changeset
244
1043
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
245 To make things a bit more practical, there are some Type subclasses like Int, Float, Str,
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
246 ImageDataset, SgdOptimizer, that include additional attributes (e.g. min, max, default) so
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
247 that automatic graph exploration algorithms can generate legal arguments with reasonable
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
248 efficiency.
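As a rough illustration, a Type subclass with `min`, `max` and `default` might look like the sketch below. The `sample` helper and the choice of uniform sampling are assumptions made for this example and are not part of the proposal.

```python
import random


class Type(object):
    """Describes a parameter domain for Instruction.execute."""

    def includes(self, value):
        raise NotImplementedError


class Float(Type):
    def __init__(self, min, max, default):
        self.min, self.max, self.default = min, max, default

    def includes(self, value):
        # Must accept every legal value and reject as many illegal ones as possible.
        return isinstance(value, float) and self.min <= value <= self.max

    def sample(self, rng):
        # Hypothetical helper: lets an exploration algorithm generate a legal argument.
        return rng.uniform(self.min, self.max)


log_lr = Float(min=-8.0, max=-1.0, default=-4.0)
candidate = log_lr.sample(random.Random(0))
```

An automatic explorer could call `sample` to propose arguments and `includes` to reject values that are clearly out of range.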
1026
38f799f8b6cd v2_planning - thoughts on learner
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1002
diff changeset
249
38f799f8b6cd v2_planning - thoughts on learner
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1002
diff changeset
250
38f799f8b6cd v2_planning - thoughts on learner
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1002
diff changeset
251
1043
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
252 The proxy pattern is a powerful way to combine learners, especially when proxy Learner
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
253 instances also introduce Proxy Instruction classes.
1026
38f799f8b6cd v2_planning - thoughts on learner
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1002
diff changeset
254
1043
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
255 For example, it is straightforward to implement a hyper-learner by implementing a Learner with
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
256 another learner (sub-learner) as a member attribute. The hyper-learner makes some
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
257 modifications to the instruction_set() return value of the sub-learner, typically to introduce
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
258 more powerful instructions and hide simpler ones.
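The hyper-learner idea can be sketched as follows. The names `SubLearner`, `HyperLearner`, `train_step` and `train_n_steps` are hypothetical, and the instruction set is represented as a plain dict of bound methods purely for brevity.

```python
class SubLearner(object):
    def __init__(self):
        self.lr = 0.1
        self.n_updates = 0

    def instruction_set(self):
        return {'set_lr': self.set_lr, 'train_step': self.train_step}

    def set_lr(self, lr):
        self.lr = lr

    def train_step(self):
        self.n_updates += 1


class HyperLearner(object):
    """Proxy that wraps a sub-learner and edits its instruction set."""

    def __init__(self, sub_learner):
        self.sub_learner = sub_learner

    def instruction_set(self):
        instructions = dict(self.sub_learner.instruction_set())
        del instructions['train_step']                       # hide the simpler instruction
        instructions['train_n_steps'] = self.train_n_steps   # expose a more powerful one
        return instructions

    def train_n_steps(self, n):
        for _ in range(n):
            self.sub_learner.train_step()


hyper = HyperLearner(SubLearner())
hyper.instruction_set()['train_n_steps'](3)
```

A client of `hyper` sees `set_lr` and `train_n_steps` only; the single-step instruction remains available on the sub-learner itself.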
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
259
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
260 It is less straightforward, but consistent with the design, to implement a Learner that
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
261 encompasses job management. Such a learner would retain the semantics of the
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
262 instruction_set of the sub-learner, but would replace the Instruction objects themselves with
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
263 Instructions that arranged for remote procedure calls (e.g. jobman, multiprocessing, bqtools,
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
264 etc.). Such a learner would replace synchronous instructions (return on completion) with
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
265 asynchronous ones (return after scheduling) and the active instruction set would also change
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
266 asynchronously, but neither of these things is inconsistent with the Learner API.
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
267
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
268
1058
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
269 TODO - Experiment API?
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
270 ~~~~~~~~~~~~~~~~~~~~~~
1043
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
271
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
272 I feel like something is missing from the API - and that is an interface to the graph structure
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
273 discussed above. The nodes in this graph are natural places to store meta-information for
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
274 visualization, statistics-gathering etc. But none of the APIs above corresponds to the graph
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
275 itself. In other words, there is no API through which to attach information to nodes. It is
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
276 not good to say that the Learner instance *is* the node because (a) learner instances change
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
277 during graph exploration and (b) learner instances are big, and we don't want to have to keep a
3f528656855b v2planning learner.txt - updated API recommendation
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1041
diff changeset
278 whole saved model just to attach meta-info e.g. validation score. Choosing this API spills
1058
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
279 over into other committees, so we should get their feedback about how to resolve
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
280 it. Maybe we need an 'Experiment' API to stand for this graph?
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
281
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
282
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
283 TODO: Validation & Monitoring Costs
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
284 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1044
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
285
1058
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
286 Even if we do have the Experiment API as a structure to hang validation and
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
287 monitoring results, what should be the mechanism for extracting those results?
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
288 The Learner API is not right because extracting a monitoring cost doesn't change
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
289 the model, doesn't change the legal instructions/edges etc. Maybe we should use
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
290 a similar mechanism to Instruction, called something like Measurement? Any node
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
291 / learner can report the list of instructions (for moving) and the list of
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
292 measurements (and the cost of computing them too).
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
293
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
294
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
295 TODO - Parameter Distributions
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
296 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1055
bc3f7834db83 added a comment/question about Type
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents: 1053
diff changeset
297
bc3f7834db83 added a comment/question about Type
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents: 1053
diff changeset
298 YB asks: it seems to me that what we really need from "Type" is not just
bc3f7834db83 added a comment/question about Type
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents: 1053
diff changeset
299 testing that a value is legal, but more practically a function that specifies the
bc3f7834db83 added a comment/question about Type
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents: 1053
diff changeset
300 prior distribution for the hyper-parameter, i.e., how to sample from it,
bc3f7834db83 added a comment/question about Type
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents: 1053
diff changeset
301 and possibly some representation of it that could be used to infer
bc3f7834db83 added a comment/question about Type
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents: 1053
diff changeset
302 a posterior (such as an unnormalized log-density or log-probability).
bc3f7834db83 added a comment/question about Type
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents: 1053
diff changeset
303 Having the min and max and default limits us to the uniform distribution,
bc3f7834db83 added a comment/question about Type
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents: 1053
diff changeset
304 which may not always be appropriate. For example sometimes we'd like
bc3f7834db83 added a comment/question about Type
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents: 1053
diff changeset
305 Gaussian (-infty to infty) or Exponential (0 to infty) or Poisson (non-negative integers).
bc3f7834db83 added a comment/question about Type
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents: 1053
diff changeset
306 For that reason, I think that "Type" is not a very good name.
bc3f7834db83 added a comment/question about Type
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents: 1053
diff changeset
307 How about "Prior" or "Density" or something like that?
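One possible reading of this suggestion, as a sketch only: an object that can be sampled from and that exposes an unnormalized log-density. The class name `GaussianPrior` and its method names are hypothetical.

```python
import random


class GaussianPrior(object):
    """Hypothetical prior over a hyper-parameter with support (-infty, infty)."""

    def __init__(self, mean, std):
        self.mean, self.std = mean, std

    def sample(self, rng):
        return rng.gauss(self.mean, self.std)

    def unnormalized_log_density(self, value):
        # Log-density up to an additive constant; enough for comparing values.
        z = (value - self.mean) / self.std
        return -0.5 * z * z


prior = GaussianPrior(mean=-4.0, std=1.5)
draw = prior.sample(random.Random(0))
```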
bc3f7834db83 added a comment/question about Type
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents: 1053
diff changeset
308
1058
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
309 JB replies: I agree that being able to choose (and update) distributions over
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
310 these values is important. I don't think the Type structure is the right place
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
311 to handle it though. The challenge is to allow those distributions to change
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
312 for a variety of reasons - e.g. the sampling distribution on the capacity
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
313 variables is affected by the size of the dataset, it is also affected by
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
314 previous experience in general as well as experiments on that particular
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
315 dataset. I'm not sure that the 'Type' structure is right to deal with this.
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
316 Also, even with a strategy for handling these distributions, I believe a simple
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
317 mechanism for rejecting insane values might be useful.
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
318
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
319 So how should we handle it? Hmmm...
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
320
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
321
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
322 Comments
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
323 ~~~~~~~~
e342de3ae485 v2planning learner - added comments and TODO points
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1055
diff changeset
324
1052
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
325 OD asks: (I hope it's ok to leave comments even though I'm not in committee... I'm
1045
d57bdd9a9980 learner: Left a comment about James' design
Olivier Delalleau <delallea@iro>
parents: 1044
diff changeset
326 interested to see how the learner interface is shaping up so I'll be keeping
d57bdd9a9980 learner: Left a comment about James' design
Olivier Delalleau <delallea@iro>
parents: 1044
diff changeset
327 an eye on this file)
d57bdd9a9980 learner: Left a comment about James' design
Olivier Delalleau <delallea@iro>
parents: 1044
diff changeset
328 I'm wondering what's the benefit of such an API compared to simply defining a
d57bdd9a9980 learner: Left a comment about James' design
Olivier Delalleau <delallea@iro>
parents: 1044
diff changeset
329 new method for each instruction. It seems to me that typically, the 'execute'
d57bdd9a9980 learner: Left a comment about James' design
Olivier Delalleau <delallea@iro>
parents: 1044
diff changeset
330 method would end up being something like
1189
0e12ea6ba661 fix many rst syntax error warning.
Frederic Bastien <nouiz@nouiz.org>
parents: 1167
diff changeset
331
0e12ea6ba661 fix many rst syntax error warning.
Frederic Bastien <nouiz@nouiz.org>
parents: 1167
diff changeset
332 .. code-block:: python
0e12ea6ba661 fix many rst syntax error warning.
Frederic Bastien <nouiz@nouiz.org>
parents: 1167
diff changeset
333
1045
d57bdd9a9980 learner: Left a comment about James' design
Olivier Delalleau <delallea@iro>
parents: 1044
diff changeset
334 if instruction == 'do_x':
d57bdd9a9980 learner: Left a comment about James' design
Olivier Delalleau <delallea@iro>
parents: 1044
diff changeset
335 self.do_x(..)
d57bdd9a9980 learner: Left a comment about James' design
Olivier Delalleau <delallea@iro>
parents: 1044
diff changeset
336 elif instruction == 'do_y':
d57bdd9a9980 learner: Left a comment about James' design
Olivier Delalleau <delallea@iro>
parents: 1044
diff changeset
337 self.do_y(..)
d57bdd9a9980 learner: Left a comment about James' design
Olivier Delalleau <delallea@iro>
parents: 1044
diff changeset
338 ...
1189
0e12ea6ba661 fix many rst syntax error warning.
Frederic Bastien <nouiz@nouiz.org>
parents: 1167
diff changeset
339
1045
d57bdd9a9980 learner: Left a comment about James' design
Olivier Delalleau <delallea@iro>
parents: 1044
diff changeset
340 so why not directly call do_x / do_y instead?
d57bdd9a9980 learner: Left a comment about James' design
Olivier Delalleau <delallea@iro>
parents: 1044
diff changeset
341
1046
f1732269bce8 comment on Olivier's comment
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1045
diff changeset
342
1052
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
343 JB replies: I agree with you, and in the implementation of a Learner I suggest
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
344 using Python decorators to get the best of both worlds:
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
345
1189
0e12ea6ba661 fix many rst syntax error warning.
Frederic Bastien <nouiz@nouiz.org>
parents: 1167
diff changeset
346 .. code-block:: python
0e12ea6ba661 fix many rst syntax error warning.
Frederic Bastien <nouiz@nouiz.org>
parents: 1167
diff changeset
347
1052
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
348 class NNet(Learner):
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
349
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
350 ...
1046
f1732269bce8 comment on Olivier's comment
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1045
diff changeset
351
1052
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
352 @Instruction.new(arg_types=(Float(min=-8, max=-1, default=-4),))
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
353 def set_log_lr(self, log_lr):
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
354 self.lr.value = numpy.exp(log_lr)
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
355
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
356 ...
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
357
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
358 The Learner base class can implement an instruction_set() that walks through the
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
359 methods of 'self' and pick out the ones that have corresponding instructions.
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
360 But anyone can call the method normally. The NNet class can also have methods
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
361 that are not instructions.
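A self-contained sketch of how such a decorator and base class could fit together. The bodies of `Instruction.new` and `Learner.instruction_set` are assumptions about one way to implement the description above, `n_params` is a made-up ordinary method, and `math.exp` on a plain attribute stands in for `numpy.exp` on `self.lr.value` so the example has no dependencies.

```python
import math


class Float(object):
    def __init__(self, min, max, default):
        self.min, self.max, self.default = min, max, default


class Instruction(object):
    @staticmethod
    def new(arg_types):
        # Assumed behaviour: tag the method and leave it directly callable.
        def decorate(method):
            method.arg_types = arg_types
            return method
        return decorate


class Learner(object):
    def instruction_set(self):
        # Walk the class attributes and keep those tagged by Instruction.new.
        cls = type(self)
        names = [n for n in dir(cls) if hasattr(getattr(cls, n), 'arg_types')]
        return [getattr(self, n) for n in names]


class NNet(Learner):
    def __init__(self):
        self.lr = None

    @Instruction.new(arg_types=(Float(min=-8, max=-1, default=-4),))
    def set_log_lr(self, log_lr):
        self.lr = math.exp(log_lr)

    def n_params(self):
        # Ordinary method: not tagged, so not reported as an instruction.
        return 0


net = NNet()
net.set_log_lr(-4)  # callable like any other method
names = [m.__name__ for m in net.instruction_set()]
```

Here `names` contains only `set_log_lr`, while `n_params` stays a normal method that is invisible to graph-exploration code.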
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
362
1053
390166ace9e5 learner: Reply to James
Olivier Delalleau <delallea@iro>
parents: 1052
diff changeset
363 OD replies: Ok thanks. I'm still unsure what is the end goal, but I'll keep
390166ace9e5 learner: Reply to James
Olivier Delalleau <delallea@iro>
parents: 1052
diff changeset
364 watching while you guys work on it, and hopefully it'll become clearer for me ;)
1052
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
365
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
366 RP asks: James, correct me if I'm wrong, but I think each instruction has an execute
1046
f1732269bce8 comment on Olivier's comment
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1045
diff changeset
367 command. The job of the learner is to traverse the graph and, for each edge
f1732269bce8 comment on Olivier's comment
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1045
diff changeset
368 that it decides to cross, to call the execute of that edge. Maybe James has
f1732269bce8 comment on Olivier's comment
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1045
diff changeset
369 something else in mind, but this was my understanding.
f1732269bce8 comment on Olivier's comment
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1045
diff changeset
370
1052
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
371 JB replies: close, but let me make a bit of a clarification. The job of a
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
372 Learner is simply to implement the API of a Learner - to list what edges are
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
373 available and to be able to cross them if asked. The code *using* the Learner
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
374 (client) decides which edges to cross. The client may also be a Learner, but
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
375 maybe not.
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
376
1046
f1732269bce8 comment on Olivier's comment
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1045
diff changeset
377
f1732269bce8 comment on Olivier's comment
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1045
diff changeset
378
1044
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
379 Just another view/spin on the same idea (Razvan)
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
380 ================================================
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
381
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
382
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
383 My idea is probably just a spin off from what James wrote. It is an extension
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
384 of what I sent on the mailing list some time ago.
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
385
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
386 Big Picture
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
387 -----------
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
388
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
389 What do we care about ?
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
390 ~~~~~~~~~~~~~~~~~~~~~~~
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
391
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
392 This is the list of the main points that I have in mind :
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
393
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
394 * Re-usability
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
395 * Extensibility
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
396 * Simplicity or easily readable code ( connected to re-usability )
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
397 * Modularity ( connected to extensibility )
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
398 * Fast to write code ( - sort of comes out of simplicity)
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
399 * Efficient code
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
400
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
401
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
402 Composition
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
403 ~~~~~~~~~~~
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
404
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
405 To me this reads as code generated by composing pieces. Imagine this :
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
406 you start off with something primitive that I will call a "variable", which
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
407 probably is a very unsuitable name. And then you compose those initial
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
408 "variables" or transform them through several "functions". Each such
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
409 "function" hides some logic, that you as the user don't care about.
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
410 You can have low-level or micro "functions" and high-level or macro
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
411 "functions", where a high-level function is just a certain compositional
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
412 pattern of low-level "functions". There are several classes of "functions"
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
413 and "variables" that can be interchangeable. This is how modularity is
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
414 obtained, by changing between functions from a certain class.
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
415
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
416 Now when you want to research something, what you do is first select
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
417 the step you want to look into. If you are lucky you can re-write this
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
418 step as a certain decomposition of low-level transformations ( there can be
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
419 multiple such decompositions). If not you have to implement such a
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
420 decomposition according to your needs. Pick the low-level transformations you want
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
421 to change and write new versions that implement your logic.
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
422
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
423 I think the code will be easy to read, because it is just applying a fixed
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
424 set of transformations, one after the other. The one who writes the code can
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
425 decide how explicit he wants to write things by switching between high-level
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
426 and low-level functions.
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
427
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
428 I think the code this way is re-usable, because you can just take this chain
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
429 of transformation and replace the one you care about, without looking into
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
430 the rest.
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
431
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
432 You get this fractal property of the code. Zooming in, you always get just
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
433 a set of functions applied to a set of variables. In the beginning those might
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
434 not be there, and you would have to create new "low level" decompositions,
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
435 maybe even new "variables" that get data between those decompositions.
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
436
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
437 The thing with variables here, is that I don't want these "functions" to have
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
438 a state. All the information is passed along through these variables. This
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
439 way understanding the graph is easy, debugging it is also easier ( than having
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
440 all these hidden states ..)
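
A small sketch of the composition idea, under the assumption that a
"variable" is just a dictionary and a "function" is a pure callable from one
variable to another. The names `compose`, `scale`, `shift` and `square` are
made up for illustration and are not part of any proposed API.

```python
from functools import reduce
from typing import Callable, Dict

Variable = Dict[str, float]          # all state travels in plain "variables"
Transform = Callable[[Variable], Variable]


def compose(*steps: Transform) -> Transform:
    """Build a high-level transform as a chain of lower-level ones."""
    return lambda var: reduce(lambda acc, step: step(acc), steps, var)


def scale(var: Variable) -> Variable:
    # Returns a new variable instead of mutating its input.
    return {**var, "x": var["x"] * var["factor"]}


def shift(var: Variable) -> Variable:
    return {**var, "x": var["x"] + 1.0}


def square(var: Variable) -> Variable:
    return {**var, "x": var["x"] ** 2}


preprocess = compose(scale, shift)        # macro "function"
pipeline = compose(preprocess, square)    # one level higher
variant = compose(scale, square)          # swap one step, reuse the rest

start = {"x": 2.0, "factor": 3.0}
result = pipeline(start)
```

Because no transform keeps hidden state, replacing `shift` (as in `variant`)
does not require reading the other steps, and `start` is left untouched.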
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
441
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
442 Note that while doing so we might ( and I strongly think we should) create
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
443 a (symbolic) DAG of operations. ( this is where it becomes what James was saying).
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
444 In such a DAG the "variables" will be the nodes and the functions will be edges.
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
445 I think having a DAG is useful in many ways (all these are things that one
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
446 might think about implementing in a far future, I'm not proposing to implement
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
447 them unless we want to use them - like the reconstruction ):
1189
0e12ea6ba661 fix many rst syntax error warning.
Frederic Bastien <nouiz@nouiz.org>
parents: 1167
diff changeset
448
1044
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
449 * there exists the possibility of writing optimizations ( Theano style )
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
450 * there exists the possibility to add global view utility functions ( like
1189
0e12ea6ba661 fix many rst syntax error warning.
Frederic Bastien <nouiz@nouiz.org>
parents: 1167
diff changeset
451 a reconstruction function for SdA - extremely low level here), or global
0e12ea6ba661 fix many rst syntax error warning.
Frederic Bastien <nouiz@nouiz.org>
parents: 1167
diff changeset
452 view diagnostic tools
1044
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
453 * the possibility of creating a GUI ( where you just create the Graph by
1189
0e12ea6ba661 fix many rst syntax error warning.
Frederic Bastien <nouiz@nouiz.org>
parents: 1167
diff changeset
454 picking transforms and variables from a list ) or working interactively
0e12ea6ba661 fix many rst syntax error warning.
Frederic Bastien <nouiz@nouiz.org>
parents: 1167
diff changeset
455 and then generating code that will reproduce the graph
1044
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
456 * you can view the graph at different granularity levels to understand
1189
0e12ea6ba661 fix many rst syntax error warning.
Frederic Bastien <nouiz@nouiz.org>
parents: 1167
diff changeset
457 things ( global diagnostics)
1044
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
458
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
459 We should have a taxonomy of possible classes of functions and possible
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
460 classes of variables, but those should not be exclusive. We can work at a high
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
461 level for now, and decompose those high level functions to lower level when
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
462 we need to. We can introduce new classes of functions or intermediate
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
463 variables between those low level functions.
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
464
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
465
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
466 Similarities with James' idea
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
467 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
468
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
469 As I said before, this is I think just another view on what James proposed.
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
470 The learner in his case is the module that traverses the graph of these
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
471 operations, which makes sense here as well.
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
472
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
473 The 'execute' command in his api is just applying a function to some variables in
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
474 my case.
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
475
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
476 The learner keeps track of the graph that is formed I think in both cases.
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
477
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
478 His view is a bit more general. I see the graph as fully created by the user,
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
479 and the learner just has to go from the start to the end. In his case the
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
480 traversal is conditioned on some policies. I think these ideas can be mixed /
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
481 united. What I would see in my case to have this functionality is something
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
482 similar to the lazy linker for Theano.
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
483
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
484
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
485
1052
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
486 JB asks: There is definitely a strong theme of graphs in both suggestions,
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
487 furthermore graphs that have heavy-duty nodes and light-weight edges. But I
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
488 don't necessarily think that we're proposing the same thing. One difference is
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
489 that the graph I talked about would be infinite in most cases of interest, so
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
490 it's not going to be representable by Theano's data structures (even with lazy
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
491 if). Another difference is that the graph I had in mind doesn't feel fractal -
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
492 it would be very common for a graph edge to be atomic. A proxy pattern, such as
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
493 in a hyper-learner would create a notion of being able to zoom in, but other
84f62533e7a8 v2planning learner - reply to comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1046
diff changeset
494 than that, I'm not sure what you mean.
1044
3b1fd599bafd my first draft of my own views which are close to be just a reformulation of what James proposes
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1043
diff changeset
495
1056
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
496 RP replies: I've been thinking about my idea a bit and yes, it might be
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
497 quite different from what James has in mind, though there are plenty of common
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
498 elements. I might have exaggerated a bit with the zooming in, so in some cases
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
499 you will end up with atomic edges, though my hope is that is not most of the
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
500 edges.
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
501
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
502 I think I should go into more details when answering this question because
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
503 I feel I have not explained things sufficiently clearly. Note, in many places
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
504 I replaced the word "function" by "transform".
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
505
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
506 Think of the learner as an object that traverses a DAG of steps created by the
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
507 user. On this DAG the learner can potentially do a lot of cool stuff, but we
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
508 won't care about that for now. The DAG can be infinite in principle, and what
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
509 the learner does is just to go on the path described by the user ( and here
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
510 described is not through heuristics like in James' case, but by giving the list
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
511 of edges it needs to follow). A potential cool thing the learner can do is to
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
512 regard the path given by the user as a suggestion ( or some form of heuristic)
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
513 and try to improve it. This would be much closer to what James has in mind,
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
514 and I definitely think it is a cool way to go about it.
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
515
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
516 Now this path in the graph is given by the user by composing subgraphs or
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
517 adding nodes to the graph. Or (expressing this in a more simple way) by applying
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
518 functions to variables. Any such function will introduce an edge ( or a subgraph) that
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
519 will connect the vertices corresponding to the input variables to the vertices
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
520 corresponding to the output variables. The variables store the state of the
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
521 learner. These functions are state-less, I think if you would give them states
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
522 you will make this approach really ugly (I might be wrong).
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
523 The variables would contain information required by the function, like
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
524 number of layers, on how many cores to run, cluster configurations, and so on.
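
One way to picture this, as a rough sketch only: applying a stateless
transform to named variables records an edge, and a learner-like routine then
follows the list of edges the user gives it. `Graph`, `variable`, `apply` and
`run_path` are hypothetical names, and the graph here is finite for
simplicity.

```python
from typing import Any, Callable, Dict, List, Tuple

# An edge is (transform, input variable names, output variable name).
Edge = Tuple[Callable[..., Any], Tuple[str, ...], str]


class Graph:
    def __init__(self) -> None:
        self.values: Dict[str, Any] = {}   # variables hold all the state
        self.edges: List[Edge] = []

    def variable(self, name: str, value: Any = None) -> str:
        self.values[name] = value
        return name

    def apply(self, fn: Callable[..., Any], inputs: Tuple[str, ...],
              output: str) -> str:
        """Applying a transform only records an edge; nothing runs yet."""
        self.variable(output)
        self.edges.append((fn, inputs, output))
        return output


def run_path(graph: Graph, path: List[int]) -> None:
    """Learner role here: follow the listed edges, in the order given."""
    for index in path:
        fn, inputs, output = graph.edges[index]
        graph.values[output] = fn(*(graph.values[name] for name in inputs))


g = Graph()
n_layers = g.variable("n_layers", 3)
n_units = g.variable("n_units", 10)
sizes = g.apply(lambda n, u: [u] * n, (n_layers, n_units), "layer_sizes")
total = g.apply(sum, (sizes,), "total_units")
run_path(g, [0, 1])
```

The transforms carry no state of their own; configuration such as the number
of layers lives in the variables, as the text suggests.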
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
525
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
526 Now about the zooming part, that James asked about. I might have exaggerated a bit,
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
527 it is not that you can zoom in on any part infinitely. You will end up with
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
528 things that are atomic. The idea is that any such "transformation" or edge
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
529 has the potential to be split up in several "transformations". This offers
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
530 (in my view) a way of solving the time constraints of our project. We can
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
531 start by defining a coarse division in segments. For now we can have
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
532 a structure transform that makes a list of parameters into a deep
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
533 network of some type, then a learner transform that adds SGD + pre-training
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
534 on top of network, and then early stopper on top of that, and then a
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
535 run_on_cluster on that.We would probably want something more finely grained
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
536 even from the start .. this is just to prove my point. When any of us
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
537 starts experimenting with a certain sub-step of this process ( like the
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
538 structure) we will split that transform into several ( like ones that create
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
539 a layer and so on) that make sense for that case, and then start working on
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
540 the low level transform that we cares ( like the layer) introducing new
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
541 versions of it. I think we can not find a universal split that will cover
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
542 all of our cases, so I think we should allow different such splits. The one
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
543 who researches should look at what low-level transforms are available and use
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
544 those if they make sense, if not he would have to create a different split.
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
545 Creating a different split might involve a lot of work and taking care of
19033ef1636d some more details on my approach
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1055
diff changeset
546 several issues so it should be done with care.
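
A minimal sketch of what "splitting" a coarse transform could look like, in
plain Python. Every name below (structure, make_layer, stack, add_sgd,
add_early_stopping) is a hypothetical placeholder and not an agreed API; the
only point is that the coarse and the zoomed-in versions produce the same
result, so whatever sits on top of them does not need to change.

```python
# Hypothetical sketch: all names are illustrative, none is an agreed API.

def structure(layer_sizes):
    """Coarse transform: a list of sizes -> a description of a network."""
    pairs = list(zip(layer_sizes[:-1], layer_sizes[1:]))
    return {"layers": pairs, "steps": []}

def add_sgd(network, learning_rate=0.01):
    """Learner transform: returns a new description with an SGD step added."""
    return {**network, "steps": network["steps"] + [("sgd", learning_rate)]}

def add_early_stopping(network, patience=5):
    """Early stopper placed on top of whatever training steps already exist."""
    return {**network, "steps": network["steps"] + [("early_stopping", patience)]}

# Coarse view: three stateless transforms composed together.
coarse = add_early_stopping(add_sgd(structure([524, 500, 500, 27])))

# Zoomed-in view: the structure transform is split into finer transforms.
def make_layer(n_in, n_out):
    """Low-level transform: the piece one would replace when experimenting."""
    return (n_in, n_out)

def stack(layers):
    """Low-level transform: assemble individual layers into a network."""
    return {"layers": list(layers), "steps": []}

def structure_split(layer_sizes):
    """Same contract as structure(), but built from the finer transforms."""
    return stack(make_layer(a, b)
                 for a, b in zip(layer_sizes[:-1], layer_sizes[1:]))

# The transforms on top (SGD, early stopping) are reused unchanged.
fine = add_early_stopping(add_sgd(structure_split([524, 500, 500, 27])))
```

Someone experimenting with layers would only swap make_layer for a new
version; add_sgd and add_early_stopping stay as they are.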

I'll give an example from where I started thinking this way. Let's say we want
to do the SdA with auxiliary inputs that encourage separation of the features
in the hidden layer, which Yoshua was talking about (I made an attempt
at it some time ago for speech, but I never ended up finishing that project).

You start with something like:

    learner = Learner()
    # This will create the learner that will traverse our graph. We might
    # want it to be a function ``execute`` instead; I just randomly picked this
    # option. I have no preference on this detail for now; this is mostly work
    # in progress.

    data = someSpeechData(path = 'some path')
    # This is a transform that will generate, from the string representing the
    # path, a dataset variable (which will contain all the information you need
    # to access the data). This will probably be the object the datasets
    # committee will provide. Note that you might need to provide more
    # information than the path, but you can easily see how to do that. All
    # these things start from simple variables like path, batch size and so on,
    # and return a complex heavy-duty variable (node).

    model = earlyStopping(pretrain(SdA(layers = [524, 500, 500, 27], noise = [0.1, 0.1]), data, epochs = 10), data)
    # This is a composition of several transforms. The SdA transform starts from
    # the info about layers and corruption/noise for each layer and constructs
    # an SdA. This is a high-level transform, so it will take care of defining
    # all details, like pre-training, defining the cost and so on. Note that it
    # may require some more parameters; you can assume that for anything else
    # there is a default value that the SdA will use. earlyStopping is yet
    # another transform that takes a model (that we know how to train) and some
    # data, and does early stopping on it. For brevity I did not provide all the
    # information required, like patience and so on. The SdA only knows how to
    # do a step of training. The same holds for pretrain: it will loop over the
    # layers of the SdA and train each one.

    steps = cluster(model, getPropertiesAndRanges(model), n_jobs = 20, cluster_info = getClusterInfo())
    # This will launch the wanted jobs. getPropertiesAndRanges will get from a
    # model all the knobs that need to be turned, and their ranges, and will
    # uniformly sample from them in each job. getClusterInfo will return a
    # variable containing information about the cluster (I added this for
    # simplicity; it should probably be replaced with something like username,
    # password, clusterpath or whatever).

    learner.execute(steps)
    # As an option, each of these output variables could contain the entire
    # graph up to that point. We could also do this in a different way; this is
    # ad hoc at the moment.
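
To make the "learner traverses our graph" idea more concrete, here is a
minimal sketch of how calling a transform could merely record a node, with
the actual work postponed until ``execute`` is called. Node, transform, load
and train are hypothetical names made up for this sketch, and the traversal
is a naive recursion; a real implementation would likely cache results and
handle cluster dispatch.

```python
# Hypothetical sketch: Node, transform and Learner.execute are illustrative
# stand-ins, not an agreed API.

class Node:
    """A variable in the graph; it remembers which transform produced it."""
    def __init__(self, name, fn, inputs):
        self.name = name
        self.fn = fn
        self.inputs = inputs

def transform(fn):
    """Wrap a stateless function so that calling it only records a node."""
    def build(*inputs):
        return Node(fn.__name__, fn, inputs)
    return build

class Learner:
    def execute(self, node):
        """Depth-first traversal: evaluate the inputs, then apply the transform."""
        if not isinstance(node, Node):
            return node  # a plain value such as a path or a size
        args = [self.execute(i) for i in node.inputs]
        return node.fn(*args)

@transform
def load(path):
    return "data(%s)" % path

@transform
def train(model, data):
    return "%s trained on %s" % (model, data)

steps = train("SdA", load("some path"))  # builds the graph, runs nothing
result = Learner().execute(steps)        # now the transforms are applied
```

Because each node keeps a reference to its inputs, the final variable
(``steps`` here) gives access to the entire graph up to that point.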

Now this is a coarse vanilla SdA, which is not what we wanted. We do not have a
way of incorporating our auxiliary information into it. So what we have to do
is split/change the SdA transform. We would rewrite it as:

    arch = SdA(layers = [524, 500, 500, 27], noise = [0.1, 0.1])
    model = earlyStopping(pretrain(arch, data, epochs = 10), data)
    ...

And then rewrite things like:

    arch = SGD(cross_entropy(logreg(DAAlayer([DAAlayer([524, 500], 0.1), 500], 0.1))))

We would rewrite the DAAlayer as:

    layer0 = DAAlayer([524, 500], 0.1)
    layer1 = cross_entropy(reconstruct(tanh(dotW_b(layer0, 500)), noise = 0.1))

At this level of detail, we can start inserting our new stuff as follows:

    input = empty_layer(600)
    # empty_layer is a wrapper; if I were to write dotW_b(200, 500), which means
    # go from a layer of 200 units to one of 500 by multiplying with a matrix
    # and adding a bias, what I would mean is dotW_b(empty_layer(200), 500).
    # An implementation of empty_layer could be just theano.tensor.vector()
    # to which we add the size tag (we will need it later).

    hidden0_mfcc = dotW_b(input[0:524], 100)
    hidden0_noise = dotW_b(input[0:560], 50)
    hidden0_speakerID = dotW_b(join(input[0:524], input[560:600]), 50)
    hidden0 = tanh(join(hidden0_mfcc, hidden0_noise, hidden0_speakerID))
    layer0 = cross_entropy(reconstruct(hidden0, noise = 0.1))

and so on. Hopefully you got what I mean by splitting a transform, or zooming
in. When doing all this we did not change anything about the early stopping or
launching jobs on the cluster. In the same manner, if one would like to look
into how jobs are sent to the cluster, one could just expand that part. Note
that if we wanted to do something else we might have split the DAA
differently.
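
One way to see why the size tag on empty_layer is needed: with it, dotW_b can
work out the shape of its weight matrix from its input alone. The sketch
below is a shape-only stand-in written in plain Python. It does not use
theano, the Layer class is hypothetical, and no numbers are computed; it only
tracks sizes through slicing, join, dotW_b and tanh. The variable is called
inp rather than input to avoid shadowing the Python builtin.

```python
# Hypothetical shape-only stand-in; a real version would wrap theano variables.

class Layer:
    """A symbolic layer that only carries its size tag and its inputs."""
    def __init__(self, size, op="input", inputs=()):
        self.size = size
        self.op = op
        self.inputs = inputs

    def __getitem__(self, s):
        # Slicing a layer gives a smaller layer; only slices are supported.
        start, stop, step = s.indices(self.size)
        return Layer(len(range(start, stop, step)), "slice", (self,))

def empty_layer(size):
    return Layer(size)

def join(*layers):
    """Concatenate layers; the sizes add up."""
    return Layer(sum(l.size for l in layers), "join", layers)

def dotW_b(layer, n_out):
    """Affine map; the weight shape is inferred from the input's size tag."""
    out = Layer(n_out, "dotW_b", (layer,))
    out.W_shape = (layer.size, n_out)
    return out

def tanh(layer):
    return Layer(layer.size, "tanh", (layer,))

inp = empty_layer(600)
hidden0_mfcc = dotW_b(inp[0:524], 100)
hidden0_noise = dotW_b(inp[0:560], 50)
hidden0_speakerID = dotW_b(join(inp[0:524], inp[560:600]), 50)
hidden0 = tanh(join(hidden0_mfcc, hidden0_noise, hidden0_speakerID))
```

With these definitions hidden0 ends up with 100 + 50 + 50 = 200 units, and
each dotW_b records the weight shape it would need.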

The key to this approach is to identify low-level units that can be
shared by 90% of our architectures, and the splits that make the most sense
from a functional point of view and that cover the main points where people
will want to change things. This will ensure that almost all the time we have
the low-level bits we want to write our code in, and that most of the
time we will only work on one of those bits. There will definitely be cases when
whatever we have will not be sufficient or convenient. In that case some
effort has to be invested by the user to create a different decomposition of
the problem into the elements they need.

I've been thinking about this a bit, and it definitely works for deep
networks and theano (the approach was inspired by theano). From what James
said, I think that other stuff might be possible to incorporate, at least as
atomic transforms if not in any other way.

TODO: one has to give some thought to these low-level transforms, to find a
suitable set of them (and variables) so that we would end up most of the time
re-using things and not creating new things.

NOTES: there are some other implementation details missing about what these
state variables should contain. I did not want to clutter this with the tricks
that could be used to get this transparent interface. I have a few of them in
mind, though.
There are a lot of hardcoded values in this example. Usually each transform
that takes inputs should "know" which of these inputs are tunable and mark
them as such. The order of the inputs in this example is important as well.
This can be easily solved at the expense of a few more lines of code that
I did not want to write.
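
As a rough illustration of the "few more lines of code", the sketch below
marks tunable inputs explicitly and passes everything by name, so the order
of the inputs no longer matters. Tunable, get_properties_and_ranges and
sample_jobs are hypothetical names, the ranges are made up for the example,
and uniform sampling is used only because that is what the cluster example
above describes.

```python
import random

# Hypothetical sketch: Tunable and the helpers below are illustrative only.

class Tunable:
    """Marks an input as a knob with a default value and a (low, high) range."""
    def __init__(self, default, low, high):
        self.default = default
        self.low = low
        self.high = high

def get_properties_and_ranges(config):
    """Collect the name and range of every input marked as tunable."""
    return {name: (value.low, value.high)
            for name, value in config.items()
            if isinstance(value, Tunable)}

def sample_jobs(config, n_jobs, seed=0):
    """Build one configuration per job, sampling each knob uniformly."""
    rng = random.Random(seed)
    ranges = get_properties_and_ranges(config)
    fixed = {name: value for name, value in config.items()
             if not isinstance(value, Tunable)}
    jobs = []
    for _ in range(n_jobs):
        job = dict(fixed)
        for name, (low, high) in ranges.items():
            job[name] = rng.uniform(low, high)
        jobs.append(job)
    return jobs

# Inputs are named, so their order is irrelevant; the ranges are made up.
sda_config = {
    "layers": [524, 500, 500, 27],
    "noise": Tunable(0.1, 0.0, 0.5),
    "learning_rate": Tunable(0.01, 0.001, 0.1),
}
jobs = sample_jobs(sda_config, n_jobs=20)
```

Each entry of jobs is then a plain dictionary that could be handed to a
single run on the cluster.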