pylearn: doc/v2_planning/learner.txt @ 1052:84f62533e7a8
v2planning learner - reply to comments

author:   James Bergstra <bergstrj@iro.umontreal.ca>
date:     Wed, 08 Sep 2010 15:43:32 -0400
parents:  f1732269bce8
children: 390166ace9e5

not good to say that the Learner instance *is* the node because (a) learner instances change
during graph exploration and (b) learner instances are big, and we don't want to have to keep a
whole saved model just to attach meta-info, e.g. a validation score. Choosing this API spills
over into other committees, so we should get their feedback about how to resolve it.

Comments
~~~~~~~~
OD asks: (I hope it's ok to leave comments even though I'm not on the committee... I'm
interested to see how the learner interface is shaping up, so I'll be keeping
an eye on this file.)
I'm wondering what the benefit of such an API is compared to simply defining a
new method for each instruction. It seems to me that typically, the 'execute'
method would end up being something like

    self.do_y(..)
    ...

so why not directly call do_x / do_y instead?

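The 'execute'-as-dispatch pattern OD is describing could be sketched like this (a minimal illustration; the instruction names and the do_x / do_y bodies are made up for the example):

```python
class Learner(object):
    """Sketch of a Learner whose single entry point dispatches on an
    instruction name (the pattern OD is questioning)."""

    def execute(self, instruction, *args):
        # Dispatch on the instruction name to an ordinary method.
        if instruction == 'x':
            return self.do_x(*args)
        elif instruction == 'y':
            return self.do_y(*args)
        raise ValueError('unknown instruction: %r' % (instruction,))

    # The same functionality is reachable as plain method calls:
    def do_x(self):
        return 'x done'

    def do_y(self, n):
        return 'y done %d times' % n
```

Here `learner.execute('y', 3)` and `learner.do_y(3)` do exactly the same thing, which is OD's point.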
JB replies: I agree with you, and in the implementation of a Learner I suggest
using Python decorators to get the best of both worlds:

    class NNet(Learner):

        ...

        @Instruction.new(arg_types=(Float(min=-8, max=-1, default=-4),))
        def set_log_lr(self, log_lr):
            self.lr.value = numpy.exp(log_lr)

        ...

The Learner base class can implement an instruction_set() method that walks
through the methods of 'self' and picks out the ones that have corresponding
instructions. But anyone can call the method normally. The NNet class can also
have methods that are not instructions.

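One way the decorator and the instruction_set() walk could fit together is sketched below. The Instruction, Float, and Value classes are stand-ins invented for the sketch (and math.exp stands in for numpy.exp to keep it dependency-free); the real API may differ:

```python
import math


class Instruction(object):
    """Stand-in: records argument types and tags a method as an instruction."""

    def __init__(self, arg_types):
        self.arg_types = arg_types

    @classmethod
    def new(cls, arg_types):
        def decorate(method):
            # Attach the Instruction as meta-data; the method itself is
            # returned unchanged, so it remains callable in the normal way.
            method.instruction = cls(arg_types)
            return method
        return decorate


class Float(object):
    """Stand-in for a typed, bounded hyper-parameter description."""

    def __init__(self, min, max, default):
        self.min, self.max, self.default = min, max, default


class Learner(object):
    def instruction_set(self):
        # Walk the attributes of self and pick out the methods that the
        # decorator tagged with an Instruction.
        rval = {}
        for name in dir(self):
            attr = getattr(self, name)
            if callable(attr) and hasattr(attr, 'instruction'):
                rval[name] = attr.instruction
        return rval


class Value(object):
    """Stand-in for a mutable parameter container."""

    def __init__(self, value):
        self.value = value


class NNet(Learner):
    def __init__(self):
        self.lr = Value(0.01)

    @Instruction.new(arg_types=(Float(min=-8, max=-1, default=-4),))
    def set_log_lr(self, log_lr):
        self.lr.value = math.exp(log_lr)  # math.exp in place of numpy.exp

    def some_helper(self):
        # Ordinary method: never reported by instruction_set().
        pass
```

With this sketch, instruction_set() lists only set_log_lr, while both set_log_lr and some_helper stay callable as normal methods.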
RP asks: James, correct me if I'm wrong, but I think each instruction has an execute
command. The job of the learner is to traverse the graph and, for each edge
that it decides to cross, to call the execute of that edge. Maybe James has
something else in mind, but this was my understanding.

JB replies: close, but let me make a bit of a clarification. The job of a
Learner is simply to implement the API of a Learner - to list which edges are
available and to be able to cross them if asked. The code *using* the Learner
(the client) decides which edges to cross. The client may also be a Learner,
but it need not be.
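The split JB describes - the Learner lists and crosses edges, the client chooses - might look like this (the method names and the client's policy are assumptions for illustration, not a settled API):

```python
class CounterLearner(object):
    """Sketch of a Learner: it only lists the edges leaving the current
    node and crosses one when asked; it never chooses for itself."""

    def __init__(self):
        self.count = 0

    def available_instructions(self):
        # Hypothetical edges out of the current node.
        if self.count < 2:
            return ['increment']
        return ['stop']

    def execute(self, instruction):
        # Cross the requested edge.
        if instruction == 'increment':
            self.count += 1


def client(learner):
    """The client - possibly itself a Learner - decides which edges to cross."""
    path = []
    while True:
        edges = learner.available_instructions()
        choice = edges[0]          # trivial policy: take the first edge
        if choice == 'stop':
            return path
        learner.execute(choice)
        path.append(choice)
```

Note that all graph-exploration policy lives in client(); swapping in a smarter client changes the traversal without touching the Learner.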
Just another view/spin on the same idea (Razvan)
================================================
united. What I would see in my case to have this functionality is something
similar to the lazy linker for Theano.

JB asks: There is definitely a strong theme of graphs in both suggestions;
furthermore, graphs that have heavy-duty nodes and light-weight edges. But I
don't necessarily think that we're proposing the same thing. One difference is
that the graph I talked about would be infinite in most cases of interest, so
it's not going to be representable by Theano's data structures (even with lazy
if). Another difference is that the graph I had in mind doesn't feel fractal -
it would be very common for a graph edge to be atomic. A proxy pattern, such as
in a hyper-learner, would create a notion of being able to zoom in, but other
than that, I'm not sure what you mean.