annotate doc/v2_planning/plugin.txt @ 1119:81ea57c6716d

clarification to plugin.txt
author Yoshua Bengio <bengioy@iro.umontreal.ca>
date Tue, 14 Sep 2010 17:22:25 -0400
parents 8cc324f388ba
children a1957faecc9b
rev   line source
1118
8cc324f388ba proposal for a plugin system
Olivier Breuleux <breuleuo@iro.umontreal.ca>
parents:
diff changeset
1
2 ======================================
3 Plugin system for iterative algorithms
4 ======================================
5
6 I would like to propose a plugin system for iterative algorithms in
7 Pylearn. Basically, it would be useful to be able to sandwich
8 arbitrary behavior in-between two training iterations of an algorithm
9 (whenever applicable). I believe many mechanisms are best implemented
10 this way: early stopping, saving checkpoints, tracking statistics,
11 real-time visualization, remote control of the process, or even
12 interlacing the training of several models and making them interact
13 with each other.
14
15 So here is the proposal: essentially, a plugin would be a (schedule,
16 timeline, function) tuple.
17
18 Schedule
19 ========
20
21 The schedule is some function that takes two "times", t1 and t2, and
22 returns True if the plugin should be run in-between these times. The
1119
81ea57c6716d clarification to plugin.txt
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents: 1118
diff changeset
23 indices refer to a "timeline" unit described below (e.g. "real time" or
24 "iterations"). The reason why we check a time range [t1, t2] rather than
25 some discrete time t is that we do not necessarily want to schedule plugins
26 on iteration numbers. For instance, we could want to run a plugin every
27 second, or every minute, and then [t1, t2] would be the start time and end
28 time of the last iteration - and then we run the plugin whenever a new
29 second started in that range (but still on training iteration
30 boundaries). Alternatively, we could want to run a plugin every n examples
31 seen - but if we use mini-batches, the nth example might be square in the
32 middle of a batch.
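
To make the range check concrete, here is a minimal sketch of a schedule as a plain function of (t1, t2). It is an illustration only, not the code in plugin.py: the helper name `every` is hypothetical, and it assumes the range is treated as half-open, (t1, t2], so that two consecutive ranges never both fire for the same boundary.

```python
import math


def every(n):
    """Hypothetical schedule factory: fire when a multiple of n lies in (t1, t2]."""
    def schedule(t1, t2):
        # If the count of multiples of n up to t2 exceeds the count up to t1,
        # at least one multiple was crossed during this range.
        return math.floor(t2 / n) > math.floor(t1 / n)
    return schedule


every_100_examples = every(100)

# Mini-batches of 32 examples: the 100th example falls inside the 4th batch,
# so the plugin runs at the end of that batch (an iteration boundary).
batch_size = 32
fired = []
for i in range(8):
    t1, t2 = i * batch_size, (i + 1) * batch_size
    if every_100_examples(t1, t2):
        fired.append(t2)

print(fired)  # [128, 224]
```

The same function works on a real-time timeline, where t1 and t2 are floats: `every(1.0)(0.4, 1.3)` is True because a new second started inside the range.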
1118
33
34 I've implemented a somewhat elaborate schedule system. `each(10)`
35 produces a schedule that returns True whenever a multiple of 10 is in
36 the time range. `at(17, 153)` produces one that returns True when 17
37 or 153 is in the time range. Schedules can be combined and negated,
38 e.g. `each(10) & ~at(20, 30)` (execute at each 10, except at 20 and
39 30). So that gives a lot of flexibility as to when you want to do
40 things.
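
One way such combinable schedules could be built is sketched below. The names `each` and `at` come from the text above, but the `Schedule` class and its internals are assumptions made for illustration; the actual plugin.py may be organized differently. The sketch also assumes half-open ranges (t1, t2].

```python
class Schedule:
    """Hypothetical wrapper around a (t1, t2) -> bool predicate."""

    def __init__(self, predicate):
        self.predicate = predicate

    def __call__(self, t1, t2):
        return bool(self.predicate(t1, t2))

    def __and__(self, other):
        # Fire only when both schedules fire on the same range.
        return Schedule(lambda t1, t2: self(t1, t2) and other(t1, t2))

    def __or__(self, other):
        # Fire when either schedule fires.
        return Schedule(lambda t1, t2: self(t1, t2) or other(t1, t2))

    def __invert__(self):
        # Fire exactly when the wrapped schedule does not.
        return Schedule(lambda t1, t2: not self(t1, t2))


def each(n):
    """Fire when a multiple of n lies in (t1, t2]."""
    return Schedule(lambda t1, t2: t2 // n > t1 // n)


def at(*times):
    """Fire when any of the given times lies in (t1, t2]."""
    return Schedule(lambda t1, t2: any(t1 < t <= t2 for t in times))


schedule = each(10) & ~at(20, 30)
fired = [t for t in range(1, 41) if schedule(t - 1, t)]
print(fired)  # [10, 40]
```

Overloading `&`, `|` and `~` keeps the combined schedule an ordinary callable, so the code that runs plugins does not need to know whether it was given a simple or a composite schedule.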
41
42 Timeline
43 ========
44
45 This would be a string indicating on what "timeline" the schedule is
46 supposed to operate. For instance, there could be a "real time"
47 timeline, an "algorithm time" timeline, an "iterations" timeline, a
48 "number of examples" timeline, and so on. This means you can schedule
49 some action to be executed every actual second, or every second of
50 training time (ignoring time spent executing plugins), or every
51 discrete iteration, or every n examples processed. This might be a
52 bloat feature (it was an afterthought to my original design, anyway),
53 but I think that there are circumstances where each of these options
54 is the best one.
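
As a rough sketch of how several timelines might be tracked side by side, the class below keeps one clock per timeline name and reports the (t1, t2) range that each clock covered during the last iteration. The `Timelines` class, its `advance` method and the injectable `clock` argument are all hypothetical; only the timeline names are taken from the text.

```python
import time


class Timelines:
    """Hypothetical set of clocks, keyed by timeline name."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._start = clock()
        self.now = {
            "real time": 0.0,
            "algorithm time": 0.0,
            "iterations": 0,
            "number of examples": 0,
        }

    def advance(self, training_seconds, n_examples):
        """Record one training iteration and return {timeline: (t1, t2)}."""
        previous = dict(self.now)
        # Wall-clock time keeps running while plugins execute ...
        self.now["real time"] = self._clock() - self._start
        # ... whereas algorithm time only accumulates time spent training.
        self.now["algorithm time"] += training_seconds
        self.now["iterations"] += 1
        self.now["number of examples"] += n_examples
        return {name: (previous[name], self.now[name]) for name in self.now}


timelines = Timelines()
ranges = timelines.advance(training_seconds=0.25, n_examples=32)
print(ranges["iterations"])          # (0, 1)
print(ranges["number of examples"])  # (0, 32)
```

A plugin registered on the "number of examples" timeline would then have its schedule called with `ranges["number of examples"]`, while one registered on "real time" would see the wall-clock range for the same iteration.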
55
56 Function
57 ========
58
59 The plugin function would receive some object containing the time
60 range, a flag indicating whether the training has started, a flag
61 indicating whether the training is done (which the plugin can set in order
62 to stop training), as well as anything pertinent about the model.
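
To show how the three parts of a plugin could fit together, here is a toy sketch of an early-stopping plugin and a loop that runs (schedule, timeline, function) tuples. Everything in it is an assumption for illustration: the state attributes (`time_range`, `started`, `done`, `loss`), the `early_stopping` and `run` helpers, and the list of precomputed losses that stands in for real training. It is not the interface of plugin.py.

```python
from types import SimpleNamespace


def early_stopping(patience):
    """Build a hypothetical plugin function that stops when the loss stalls."""
    best = {"loss": float("inf"), "stale": 0}

    def plugin(state):
        if state.loss < best["loss"]:
            best["loss"], best["stale"] = state.loss, 0
        else:
            best["stale"] += 1
        if best["stale"] >= patience:
            state.done = True  # setting the flag asks the loop to stop

    return plugin


def run(losses, plugins):
    """Toy training loop; each entry of `losses` stands in for one iteration."""
    state = SimpleNamespace(time_range=(0, 0), started=False, done=False, loss=None)
    for i, loss in enumerate(losses):
        state.started = True
        state.loss = loss
        state.time_range = (i, i + 1)
        for schedule, timeline, function in plugins:
            # Only the "iterations" timeline exists in this toy loop.
            if timeline == "iterations" and schedule(*state.time_range):
                function(state)
        if state.done:
            break
    return state


def every_iteration(t1, t2):
    return True


plugins = [(every_iteration, "iterations", early_stopping(patience=2))]
final = run([0.9, 0.7, 0.8, 0.75, 0.6], plugins)
print(final.done, final.time_range)  # True (3, 4)
```

The loss fails to improve on iterations 3 and 4, so the plugin sets `done` and the loop stops before reaching the last value.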
63
64 Implementation
65 ==============
66
67 I have implemented the feature in plugin.py, in this directory. Simply
68 run `python plugin.py` to test it.
69