comparison doc/v2_planning/learner.txt @ 1052:84f62533e7a8

v2planning learner - reply to comments
author James Bergstra <bergstrj@iro.umontreal.ca>
date Wed, 08 Sep 2010 15:43:32 -0400
parents f1732269bce8
children 390166ace9e5
1051:bc246542d6ff 1052:84f62533e7a8
266 not good to say that the Learner instance *is* the node because (a) learner instances change 266 not good to say that the Learner instance *is* the node because (a) learner instances change
267 during graph exploration and (b) learner instances are big, and we don't want to have to keep a 267 during graph exploration and (b) learner instances are big, and we don't want to have to keep a
268 whole saved model just to attach meta-info e.g. validation score. Choosing this API spills 268 whole saved model just to attach meta-info e.g. validation score. Choosing this API spills
269 over into other committees, so we should get their feedback about how to resolve it. 269 over into other committees, so we should get their feedback about how to resolve it.
270 270
271 Comment by OD 271 Comments
272 ~~~~~~~~~~~~~ 272 ~~~~~~~~
273 (I hope it's ok to leave comments even though I'm not in committee... I'm 273 OD asks: (I hope it's ok to leave comments even though I'm not in committee... I'm
274 interested to see how the learner interface is shaping up so I'll be keeping 274 interested to see how the learner interface is shaping up so I'll be keeping
275 an eye on this file) 275 an eye on this file)
276 I'm wondering what's the benefit of such an API compared to simply defining a 276 I'm wondering what's the benefit of such an API compared to simply defining a
277 new method for each instruction. It seems to me that typically, the 'execute' 277 new method for each instruction. It seems to me that typically, the 'execute'
278 method would end up being something like 278 method would end up being something like
282 self.do_y(..) 282 self.do_y(..)
283 ... 283 ...
284 so why not directly call do_x / do_y instead? 284 so why not directly call do_x / do_y instead?
285 285
286 286
287 Comment by RP 287 JB replies: I agree with you, and in the implementation of a Learner I suggest
288 ~~~~~~~~~~~~~ 288 using Python decorators to get the best of both worlds:
289 289
290 James correct me if I'm wrong, but I think each instruction has a execute 290 class NNet(Learner):
291
292     ...
293
294     @Instruction.new(arg_types=(Float(min=-8, max=-1, default=-4),))
295     def set_log_lr(self, log_lr):
296         self.lr.value = numpy.exp(log_lr)
297
298     ...
299
300 The Learner base class can implement an instruction_set() that walks through the
301 methods of 'self' and picks out the ones that have corresponding instructions.
302 But anyone can call the method normally. The NNet class can also have methods
303 that are not instructions.
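
A minimal sketch of how such a decorator and instruction_set() might fit
together. Everything here is an assumption made for illustration: the shape of
Instruction and Float, the 'instruction' attribute used as a tag, the dict
returned by instruction_set(), and the describe() method are hypothetical. A
plain float stands in for the lr.value shared variable, and math.exp replaces
numpy.exp so the sketch has no dependencies.

```python
import math


class Float:
    """Hypothetical argument descriptor: a float range with a default."""

    def __init__(self, min, max, default):
        self.min, self.max, self.default = min, max, default


class Instruction:
    """Hypothetical metadata record attached to a method."""

    def __init__(self, arg_types):
        self.arg_types = arg_types

    @classmethod
    def new(cls, arg_types=()):
        def decorate(method):
            # Tag the function and return it unchanged, so it can still
            # be called like any other method.
            method.instruction = cls(arg_types)
            return method
        return decorate


class Learner:
    def instruction_set(self):
        """Return {name: bound method} for every tagged method of self."""
        found = {}
        for name in dir(type(self)):
            attr = getattr(type(self), name, None)
            if callable(attr) and hasattr(attr, "instruction"):
                found[name] = getattr(self, name)
        return found


class NNet(Learner):
    def __init__(self):
        self.lr = 0.0  # assumption: plain float instead of a shared variable

    @Instruction.new(arg_types=(Float(min=-8, max=-1, default=-4),))
    def set_log_lr(self, log_lr):
        self.lr = math.exp(log_lr)

    def describe(self):
        # An ordinary method: not tagged, so not part of the instruction set.
        return "NNet(lr=%g)" % self.lr
```

With this sketch, NNet().instruction_set() lists only set_log_lr, while
net.set_log_lr(-2.0) remains a normal method call.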
304
305
306
307 RP asks: James correct me if I'm wrong, but I think each instruction has an execute
291 command. The job of the learner is to traverse the graph and for each edge 308 command. The job of the learner is to traverse the graph and for each edge
292 that it decides to cross to call the execute of that edge. Maybe James has 309 that it decides to cross to call the execute of that edge. Maybe James has
293 something else in mind, but this was my understanding. 310 something else in mind, but this was my understanding.
311
312 JB replies: close, but let me make a bit of a clarification. The job of a
313 Learner is simply to implement the API of a Learner - to list what edges are
314 available and to be able to cross them if asked. The code *using* the Learner
315 (client) decides which edges to cross. The client may also be a Learner, but
316 maybe not.
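
The division of labour described above could be sketched as follows. This is
only an illustration under stated assumptions: the method names edges() and
cross(), the toy CounterLearner, and the run_client() helper are all
hypothetical and are not part of any agreed API.

```python
class CounterLearner:
    """Toy stand-in for a Learner: its whole state is one integer."""

    def __init__(self):
        self.state = 0

    def edges(self):
        # List the edges available from the current node.
        return {"inc": self._inc, "double": self._double}

    def cross(self, name):
        # Cross one edge when asked; the Learner never picks an edge itself.
        self.edges()[name]()

    def _inc(self):
        self.state += 1

    def _double(self):
        self.state *= 2


def run_client(learner, plan):
    """Client code: decides which edges to cross, in which order."""
    for name in plan:
        if name not in learner.edges():
            raise KeyError("no such edge: %r" % (name,))
        learner.cross(name)
    return learner.state
```

Here run_client(CounterLearner(), ["inc", "inc", "double"]) drives the learner
along three edges chosen entirely by the client. A hyper-learner would play the
role of run_client while itself exposing edges() and cross().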
294 317
295 318
296 319
297 Just another view/spin on the same idea (Razvan) 320 Just another view/spin on the same idea (Razvan)
298 ================================================ 321 ================================================
398 united. What I would see in my case to have this functionality is something 421 united. What I would see in my case to have this functionality is something
399 similar to the lazy linker for Theano. 422 similar to the lazy linker for Theano.
400 423
401 424
402 425
403 426 JB asks: There is definitely a strong theme of graphs in both suggestions,
427 furthermore graphs that have heavy-duty nodes and light-weight edges. But I
428 don't necessarily think that we're proposing the same thing. One difference is
429 that the graph I talked about would be infinite in most cases of interest, so
430 it's not going to be representable by Theano's data structures (even with lazy
431 if). Another difference is that the graph I had in mind doesn't feel fractal -
432 it would be very common for a graph edge to be atomic. A proxy pattern, such as
433 in a hyper-learner would create a notion of being able to zoom in, but other
434 than that, I'm not sure what you mean.
435