annotate doc/v2_planning/plugin_architecture_GD.txt @ 1207:53937045f6c7
Pasted content of email sent by Ian about existing python ML libraries
author | Olivier Delalleau <delallea@iro> |
---|---|
date | Tue, 21 Sep 2010 10:58:14 -0400 |
parents | 9ff2242a817b |
children |
Overview
========

The "central authority" (CA) is the glue which takes care of interfacing plugins
with one another. It has 3 basic roles:

* it maintains a list of "registered" or "active" plugins
* it receives and queues the various messages sent by the plugins
* it dispatches the messages to the recipients, based on various "events"

Events can take different forms:

* the CA can trigger various events based on running time
* events can be linked to messages emitted by the various plugins, and can be
  triggered based on the frequency of such messages
* once an event is triggered, it is relayed to the appropriate "recipient
  plugin(s)"

It is the responsibility of each plugin to inform the CA of which "events" it
cares about.

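As a rough illustration of the frequency-based events described above, the
sketch below counts messages per class and forwards only every ``interval``-th
one. The names ``Ping``, ``event_count`` and ``should_forward`` are hypothetical
and exist only for this example; the modulo rule is just one possible trigger
condition.

```python
from collections import defaultdict


class Ping(object):
    """Hypothetical message class, used only for this illustration."""


# number of messages seen so far, per message class
event_count = defaultdict(int)


def should_forward(msg, interval):
    """Count `msg` and report whether it is the interval-th of its class."""
    event_count[type(msg)] += 1
    return event_count[type(msg)] % interval == 0


# send nine messages; with interval=3 only every third one is forwarded
forwarded = [i for i in range(1, 10) if should_forward(Ping(), 3)]
print(forwarded)  # [3, 6, 9]
```

A time-based event could reuse the same rule by letting the CA emit its own
"tick" message at regular intervals.
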
Generic Pseudo-code
===================

I'll try to write this in pseudo-Python as best I can. I'll do this in
traditional OOP, as this is what I'm more comfortable with. I'll leave it up to
James and OB to python-ize this :)


.. code-block:: python

    class MessageX(Message):
        """
        A message is basically a data container. This could very well be replaced by
        a generic Python object.
        """

    class Plugin(object):
        """
        The base plugin object doesn't do much. It contains a reference to the CA
        (set when the plugin is registered with the CA), provides boilerplate code
        for storing which "events" this plugin is susceptible to, as well as code
        for registering callback functions for the various messages.
        """

        def __init__(self):
            self.ca = None        # to be initialized upon plugin registration
            self.active_msg = {}  # message class --> interval this plugin listens at
            self.callbacks = {}   # message class --> callback function

            # collect the methods which were tagged by the @handler decorator
            for name in dir(self):
                method = getattr(self, name)
                msg_class = getattr(method, 'handled_msg_class', None)
                if msg_class is not None:
                    self.callbacks[msg_class] = method

        def listen(self, msg_class, interval):
            """
            :param msg_class: reference to the "message" class we are interested in.
                              These messages will be forwarded to this plugin, when
                              the trigger condition is met.
            :param interval: integer. Forward the message to this plugin every 'interval'
                             such messages.
            """
            self.active_msg[msg_class] = interval

        def check_trigger(self, msg_class, time):
            """
            Checks whether or not the "trigger" condition associated with messages of
            class 'msg_class' is satisfied. This could be the default
            behavior, and be overridden by the various plugins.
            """
            return time % self.active_msg[msg_class] == 0

        def execute(self, message):
            """
            Boiler-plate code which executes the right callback function, for the
            given message type.
            """
            for (msg_class, callback) in self.callbacks.items():
                if message.__class__ == msg_class:
                    callback(message)


    def handler(msg_class):
        """
        Decorator which registers a callback function for the given message
        type.

        NOTE: a decorator with an explicit parameter is written as a function
        which returns the actual decorator. The plugin instance does not exist
        yet when the class body is executed, so the decorator only tags the
        function; Plugin.__init__ then fills in the 'callbacks' dictionary.

        :param msg_class: reference to the message class for which we are
                          registering a callback function
        """
        def decorator(callback):
            callback.handled_msg_class = msg_class
            return callback
        return decorator


    class ProducerPlugin(Plugin):

        def dostuff(self):
            """
            A typical "producer" plugin. It basically performs an arbitrary action
            and asks the CA to forward the results (in the form of a message) to
            other plugins.
            """

            # iteratively do stuff and relay messages to other plugins
            while condition:  # placeholder for the actual stopping criterion

                msga = ...  # do something which produces a MessageA instance
                self.ca.send(msga)  # ask CA to forward to other plugins


    class ConsumerPlugin(Plugin):

        @handler(MessageA)
        def func(self, msga):
            """
            A consumer or "passive plugin" (e.g. logger, etc). This function is
            registered as being the callback function for MessageA objects.
            """
            # do something with message A


    class ConsumerProducerPlugin(Plugin):

        @handler(MessageA)
        def func(self, msga):
            """
            Example of a consumer / producer plugin. It receives MessageA messages,
            processes the data, then asks the CA to send a new message (MessageB) as
            the result of its computation. The CA will automatically forward to all
            interested parties.

            :param msga: MessageA instance
            """

            # dostuff stands for whatever processing this plugin performs
            data = dostuff(msga)   # process message
            msgb = MessageB(data)  # generate new message for other plugins
            self.ca.send(msgb)     # ask CA to forward to other plugins



    import time


    class CentralAuthority(object):

        def __init__(self):
            self.active_plugins = []  # contains a list of registered plugins

            self.mailman = {}         # dictionary which contains, for each message class, a
                                      # list of plugins interested in this message

            self.event_count = {}     # dictionary of "event" counts for various messages

        def register(self, plugin):
            """
            Registers the plugin and adds it as a listener for the various messages
            it is interested in.

            :param plugin: plugin instance which we want to "activate"
            """

            # each plugin must have a reference to the CA
            plugin.ca = self

            # maintain list of active plugins
            self.active_plugins.append(plugin)

            # remember which messages this plugin cares about
            for msg_class in plugin.active_msg.keys():
                self.mailman.setdefault(msg_class, []).append(plugin)
                self.event_count.setdefault(msg_class, 0)

        def send(self, msg):
            """
            This function relays the message to the appropriate plugins, based on
            their "trigger" condition. It also keeps track of the number of times
            this event was raised.

            :param msg: message instance
            """

            msg_class = msg.__class__
            self.event_count[msg_class] = self.event_count.get(msg_class, 0) + 1

            # for all plugins interested in this message ...
            for plugin in self.mailman.get(msg_class, []):

                # check if trigger condition is met
                if plugin.check_trigger(msg_class, self.event_count[msg_class]):

                    # have the plugin execute the message
                    plugin.execute(msg)

        def run(self):
            """
            This would be the main loop of the program. I won't go into details
            because it's still somewhat blurry in my head :) But basically, the CA
            could be configured to send out its own messages, independently from all
            other plugins.

            These could be "synchronous" messages such as: "5 seconds have passed",
            or others such as "save state we are about to get killed".

            NOTE: seems like this would almost have to live in its own thread ...
            """

            # the following would be parametrized obviously
            while True:
                msg = ElapsedTimeMessage(5)
                self.send(msg)
                time.sleep(5)



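The NOTE in ``run`` suggests that the time-based loop would need its own thread.
One possible arrangement is sketched below, assuming the standard ``threading``
module: a daemon thread calls a ``send``-like function at a fixed period until
it is asked to stop. ``TimerLoop`` and ``received`` are hypothetical names which
are not part of the design above.

```python
import threading
import time


class TimerLoop(object):
    """Hypothetical helper: calls `send(tick)` every `period` seconds."""

    def __init__(self, send, period):
        self.send = send
        self.period = period
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._loop)
        self._thread.daemon = True  # do not keep the program alive

    def _loop(self):
        tick = 0
        # Event.wait returns False on timeout and True once stop() was called
        while not self._stop.wait(self.period):
            tick += 1
            self.send(tick)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()


received = []                      # stands in for the CA's send method
loop = TimerLoop(received.append, period=0.01)
loop.start()
time.sleep(0.2)                    # the main program does its own work here
loop.stop()
print("ticks received: %d" % len(received))
```

With this arrangement ``send`` runs in the timer thread, so the plugins'
callbacks would either have to be thread-safe, or the tick would have to be
queued and handled by the main loop.
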
Putting it all together
=======================


.. code-block:: python

    def main():

        ca = CentralAuthority()

        producer = ProducerPlugin()
        ca.register(producer)

        consumer = ConsumerPlugin()
        consumer.listen(MessageA, 1)
        ca.register(consumer)

        other = ConsumerProducerPlugin()
        other.listen(MessageA, 10)
        ca.register(other)

        # this is the function call which gets the ball rolling
        producer.dostuff()


DISCUSSION: blocking vs. non-blocking
=====================================

In the above example, I used "blocking" sends. However it is not clear that this
is the best option.

In the example, the producer basically acts as the main loop. It relinquishes
control of the main loop when the CA decides to forward the message to other
plugins. Control will only be returned once the cascade of send/receives
initiated with MessageA is complete (all subplugins have processed MessageA and
any messages sent as a side-effect have also been processed).

This definitely imposes constraints on what the plugins can do, and how they do
it. For the type of single-processor / linear jobs we tend to run, this might be
enough (??).

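To make the blocking behavior concrete, here is a stripped-down sketch in which
``send`` returns only after every callback (and every message sent as a
side-effect) has been processed. The names ``send``, ``subscribers``, ``on_a``,
``on_b`` and ``log`` are hypothetical and stand in for the CA and its plugins.

```python
log = []           # records the order in which things happen
subscribers = {}   # message name --> list of callbacks (hypothetical registry)


def send(name, payload):
    """Blocking send: returns only after every callback has finished."""
    for callback in subscribers.get(name, []):
        callback(payload)


def on_a(payload):
    log.append("start A%d" % payload)
    send("B", payload)           # side-effect message, processed immediately
    log.append("end A%d" % payload)


def on_b(payload):
    log.append("got B%d" % payload)


subscribers["A"] = [on_a]
subscribers["B"] = [on_b]

for i in range(2):               # the producer acts as the main loop
    send("A", i)
    log.append("producer resumes after A%d" % i)

print("\n".join(log))
```

The log shows that each B message is handled in the middle of the
corresponding A callback, and that the producer only resumes once the whole
cascade is finished.
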
The good news is that going forward, the above plugin architecture can also
scale to distributed systems, by changing the sends to be non-blocking. Plugins
could then live on different machines and process data as they see fit.
Synchronization would be enforced by way of messages. In the above, the "main
producer" would thus become a consumer/producer which listens for "done processing
MessageA" messages and produces a new MessageA as a result.

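A non-blocking variant could look like the following sketch, in which sending
only enqueues the message and a separate loop delivers it later. This is an
assumption about how it might be done on a single machine with the standard
``collections.deque``; a distributed version would replace the queue with some
network transport. ``post``, ``drain``, ``pending`` and ``subscribers`` are
hypothetical names.

```python
from collections import deque

log = []            # records the order in which things happen
subscribers = {}    # message name --> list of callbacks (hypothetical registry)
pending = deque()   # messages waiting to be delivered


def post(name, payload):
    """Non-blocking send: enqueue the message and return immediately."""
    pending.append((name, payload))


def drain():
    """Deliver queued messages in FIFO order until the queue is empty."""
    while pending:
        name, payload = pending.popleft()
        for callback in subscribers.get(name, []):
            callback(payload)


def on_a(payload):
    log.append("start A%d" % payload)
    post("B", payload)            # queued, not processed yet
    log.append("end A%d" % payload)


def on_b(payload):
    log.append("got B%d" % payload)


subscribers["A"] = [on_a]
subscribers["B"] = [on_b]

post("A", 0)
post("A", 1)
log.append("producer continues immediately")
drain()
print("\n".join(log))
```

Here the producer is never held up by its consumers, and the B messages are
only handled after both A callbacks have returned. The price is that ordering
must now be enforced explicitly through messages.
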
On single-processor systems, however, the synchronization overhead might be too
costly. That is something we would have to investigate. On the plus side,
our plugins would be "future proof" and lend themselves well to the
type of "massively parallel jobs" we wish to run (i.e. meta-learners, etc.)



Logistic Regression
===================


TO COME SOON (?)