annotate doc/v2_planning/plugin.txt @ 1118:8cc324f388ba

proposal for a plugin system
author Olivier Breuleux <breuleuo@iro.umontreal.ca>
date Tue, 14 Sep 2010 16:01:32 -0400
parents
children 81ea57c6716d

======================================
Plugin system for iterative algorithms
======================================

I would like to propose a plugin system for iterative algorithms in
Pylearn. Basically, it would be useful to be able to sandwich
arbitrary behavior between two training iterations of an algorithm
(whenever applicable). I believe many mechanisms are best implemented
this way: early stopping, saving checkpoints, tracking statistics,
real-time visualization, remote control of the process, or even
interlacing the training of several models and making them interact
with each other.

So here is the proposal: essentially, a plugin would be a (schedule,
timeline, function) tuple.

Schedule
========

The schedule is a function that takes two "times", t1 and t2, and
returns True if the plugin should be run between these times. We
check a time range [t1, t2] rather than a single discrete time t
because we do not necessarily want to schedule plugins on iteration
numbers. For instance, we might want to run a plugin every second, or
every minute. Then [t1, t2] would be the start time and end time of
the last iteration, and we would run the plugin whenever a new second
started in that range (but still on training iteration boundaries).
Alternatively, we might want to run a plugin every n examples seen,
but with mini-batches the nth example might fall squarely in the
middle of a batch, so no iteration boundary lands exactly on it.
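
To make the range check concrete, here is a minimal sketch of one such
schedule. The name `every` and the choice of a half-open range
(t1, t2] are assumptions made for this illustration (the half-open
range keeps a boundary value from firing on two consecutive
iterations); this is not the code in plugin.py.

```python
def every(period):
    """Return a schedule that fires when a multiple of `period` lies in (t1, t2].

    Hypothetical helper for illustration only.
    """
    def schedule(t1, t2):
        # Floor division counts the multiples of `period` up to each endpoint;
        # if the counts differ, at least one multiple falls inside the range.
        return t2 // period > t1 // period
    return schedule


# Run a plugin every 100 examples while training with mini-batches of 32.
every_100_examples = every(100)
batch_size = 32
fired_after = []
seen = 0
for _ in range(10):
    t1, t2 = seen, seen + batch_size
    if every_100_examples(t1, t2):
        fired_after.append(t2)  # the plugin runs at the batch boundary
    seen = t2

print(fired_after)  # [128, 224, 320]
```

The 100th, 200th and 300th examples all fall inside a batch, so the
plugin runs at the first iteration boundary after each of them. The
same function works for real time, e.g. `every(1.0)(0.4, 1.2)` is true
because a new second started in that range.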

I've implemented a somewhat elaborate schedule system. `each(10)`
produces a schedule that returns true whenever a multiple of 10 is in
the time range. `at(17, 153)` produces one that returns true when 17
or 153 is in the time range. Schedules can be combined and negated,
e.g. `each(10) & ~at(20, 30)` (execute at each multiple of 10, except
at 20 and 30). This gives a lot of flexibility as to when you want to
do things.
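
As a rough sketch of how such combinable schedules could be built, the
block below wraps a predicate in a small class that overloads `&`, `|`
and `~`. The names `each` and `at` come from the text above; the
`Schedule` class, the `|` operator and the half-open range (t1, t2]
are assumptions of this sketch and may differ from plugin.py.

```python
class Schedule:
    """Wraps a predicate over a time range so schedules can be combined."""

    def __init__(self, predicate):
        self.predicate = predicate

    def __call__(self, t1, t2):
        return self.predicate(t1, t2)

    def __and__(self, other):
        return Schedule(lambda t1, t2: self(t1, t2) and other(t1, t2))

    def __or__(self, other):
        return Schedule(lambda t1, t2: self(t1, t2) or other(t1, t2))

    def __invert__(self):
        return Schedule(lambda t1, t2: not self(t1, t2))


def each(n):
    """Fire when a multiple of n lies in (t1, t2]."""
    return Schedule(lambda t1, t2: t2 // n > t1 // n)


def at(*times):
    """Fire when any of the given times lies in (t1, t2]."""
    return Schedule(lambda t1, t2: any(t1 < t <= t2 for t in times))


schedule = each(10) & ~at(20, 30)
# Step through iterations 1..50, one iteration per time range.
fired = [t for t in range(1, 51) if schedule(t - 1, t)]
print(fired)  # [10, 40, 50]
```

With ranges wider than one step, the negation becomes coarser: a range
that contains both 20 and another multiple of 10 would be suppressed
entirely under this sketch.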

Timeline
========

This would be a string indicating which "timeline" the schedule is
supposed to operate on. For instance, there could be a "real time"
timeline, an "algorithm time" timeline, an "iterations" timeline, a
"number of examples" timeline, and so on. This means you can schedule
some action to be executed every actual second, or every second of
training time (ignoring time spent executing plugins), or every
discrete iteration, or every n examples processed. This might be
feature bloat (it was an afterthought to my original design, anyway),
but I think that there are circumstances where each of these options
is the best one.
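
One possible way to support several timelines is to keep, for each
timeline name, the time before and after the last iteration, and hand
the matching (t1, t2) pair to the schedule. The `Timelines` class and
its methods below are hypothetical names for this sketch only; the
"algorithm time" timeline is left out to keep it short, and the clock
is injectable so the example is deterministic.

```python
import time


class Timelines:
    """Tracks a (t1, t2) range per named timeline across training iterations."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.current = {
            "real time": clock(),
            "iterations": 0,
            "number of examples": 0,
        }
        self.previous = dict(self.current)

    def advance(self, n_examples):
        """Record that one iteration over `n_examples` examples just finished."""
        self.previous = dict(self.current)
        self.current["real time"] = self.clock()
        self.current["iterations"] += 1
        self.current["number of examples"] += n_examples

    def time_range(self, timeline):
        """Return (t1, t2) for the last iteration on the named timeline."""
        return self.previous[timeline], self.current[timeline]


# A fake clock stands in for time.time so the output is reproducible.
fake_clock = iter([100.0, 100.4, 101.1]).__next__
timelines = Timelines(clock=fake_clock)
timelines.advance(n_examples=32)
timelines.advance(n_examples=32)

print(timelines.time_range("iterations"))          # (1, 2)
print(timelines.time_range("number of examples"))  # (32, 64)
print(timelines.time_range("real time"))           # (100.4, 101.1)
```

A plugin registered on the "real time" timeline with an every-second
schedule would fire after the second iteration here, since a new
second (101) started between 100.4 and 101.1.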

Function
========

The plugin function would receive some object containing the time
range, a flag indicating whether the training has started, a flag
indicating whether the training is done (which the plugin can set in
order to stop training), as well as anything pertinent about the
model.
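
The sketch below shows what such a function and the surrounding loop
might look like for early stopping. Everything in it is an assumption
made for illustration: the `State` fields, the `train` loop, the toy
"model" (a dict whose loss halves each iteration) and the fact that
only the "iterations" timeline is handled. It is not the interface of
plugin.py.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    """Hypothetical object passed to every plugin function."""
    time_range: tuple = (0, 0)
    started: bool = False
    done: bool = False
    model: dict = field(default_factory=dict)


def stop_when_converged(state):
    """Early-stopping plugin: set `done` once the tracked loss is small enough."""
    if state.model["loss"] < 0.1:
        state.done = True


def every_iteration(t1, t2):
    """Schedule that fires whenever at least one iteration has elapsed."""
    return t2 > t1


def train(plugins, max_iterations=100):
    state = State(model={"loss": 1.0})
    for iteration in range(1, max_iterations + 1):
        state.model["loss"] *= 0.5  # stand-in for one real training iteration
        state.started = True
        state.time_range = (iteration - 1, iteration)
        for schedule, timeline, function in plugins:
            # Only the "iterations" timeline is tracked in this sketch.
            if timeline == "iterations" and schedule(*state.time_range):
                function(state)
        if state.done:  # a plugin asked to stop training
            break
    return state


final = train([(every_iteration, "iterations", stop_when_converged)])
print(final.time_range, final.model["loss"])  # (3, 4) 0.0625
```

The training loop itself knows nothing about early stopping; it only
checks the `done` flag after the plugins have run, which is what lets
checkpointing, statistics or visualization be added the same way.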

Implementation
==============

I have implemented the feature in plugin.py, in this directory. Simply
run `python plugin.py` to test it.