# HG changeset patch
# User James Bergstra
# Date 1283975012 14400
# Node ID 84f62533e7a88adb220159cecf960e9debd86527
# Parent bc246542d6ff8dd4f7213c36f980bd5953d6fff4
v2planning learner - reply to comments

diff -r bc246542d6ff -r 84f62533e7a8 doc/v2_planning/learner.txt
--- a/doc/v2_planning/learner.txt Wed Sep 08 15:39:51 2010 -0400
+++ b/doc/v2_planning/learner.txt Wed Sep 08 15:43:32 2010 -0400
@@ -268,9 +268,9 @@
 whole saved model just to attach meta-info e.g. validation score. Choosing
 this API spills over into other committees, so we should get their feedback
 about how to resolve it.
-Comment by OD
-~~~~~~~~~~~~~
-(I hope it's ok to leave comments even though I'm not in committee... I'm
+Comments
+~~~~~~~~
+OD asks: (I hope it's ok to leave comments even though I'm not in committee... I'm
 interested to see how the learner interface is shaping up so I'll be keeping
 an eye on this file)
 I'm wondering what's the benefit of such an API compared to simply defining a
@@ -284,14 +284,37 @@
 so why not directly call do_x / do_y instead?
 
 
-Comment by RP
-~~~~~~~~~~~~~
+JB replies: I agree with you, and in the implementation of a Learner I suggest
+using Python decorators to get the best of both worlds:
+
+    class NNet(Learner):
+
+        ...
 
-James correct me if I'm wrong, but I think each instruction has a execute
+        @Instruction.new(arg_types=(Float(min=-8, max=-1, default=-4),))
+        def set_log_lr(self, log_lr):
+            self.lr.value = numpy.exp(log_lr)
+
+        ...
+
+The Learner base class can implement an instruction_set() that walks through the
+methods of 'self' and picks out the ones that have corresponding instructions.
+But anyone can call the method normally. The NNet class can also have methods
+that are not instructions.
+
+
+
+RP asks: James correct me if I'm wrong, but I think each instruction has an execute
 command. The job of the learner is to traverse the graph and for each edge
 that it decides to cross to call the execute of that edge.
 Maybe James has something else in mind, but this was my understanding.
 
+JB replies: close, but let me make a bit of a clarification. The job of a
+Learner is simply to implement the API of a Learner - to list what edges are
+available and to be able to cross them if asked. The code *using* the Learner
+(client) decides which edges to cross. The client may also be a Learner, but
+maybe not.
+
 
 
 Just another view/spin on the same idea (Razvan)
@@ -400,4 +423,13 @@
 
 
 
+JB asks: There is definitely a strong theme of graphs in both suggestions,
+furthermore graphs that have heavy-duty nodes and light-weight edges. But I
+don't necessarily think that we're proposing the same thing. One difference is
+that the graph I talked about would be infinite in most cases of interest, so
+it's not going to be representable by Theano's data structures (even with lazy
+if). Another difference is that the graph I had in mind doesn't feel fractal -
+it would be very common for a graph edge to be atomic. A proxy pattern, such as
+in a hyper-learner would create a notion of being able to zoom in, but other
+than that, I'm not sure what you mean.
 
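
The decorator approach in JB's first reply, together with the point that the client (not the Learner) chooses which edge to cross, could be sketched roughly as follows. This is only an illustration under assumptions, not the committee's implementation: the bodies of `Instruction`, `Float` and `Learner` are hypothetical stand-ins invented here, the `describe` method is made up to show a non-instruction method, and a plain float with `math.exp` replaces the `self.lr.value` / `numpy.exp` pair from the patch so the block runs on its own.

```python
import inspect
import math


class Float:
    """Hypothetical argument descriptor: a float range with a default."""

    def __init__(self, min, max, default):
        self.min, self.max, self.default = min, max, default


class Instruction:
    """Hypothetical marker attached to methods that are also instructions."""

    def __init__(self, arg_types):
        self.arg_types = arg_types

    @classmethod
    def new(cls, arg_types=()):
        def decorate(method):
            # Tag the function and return it unchanged, so it remains an
            # ordinary method that anyone can call directly.
            method.instruction = cls(arg_types)
            return method
        return decorate


class Learner:
    def instruction_set(self):
        """Return {name: bound method} for every method tagged as an instruction."""
        found = {}
        for name, member in inspect.getmembers(self, predicate=inspect.ismethod):
            if hasattr(member, "instruction"):
                found[name] = member
        return found


class NNet(Learner):
    def __init__(self):
        self.lr = 0.01  # plain float standing in for the patch's self.lr.value

    @Instruction.new(arg_types=(Float(min=-8, max=-1, default=-4),))
    def set_log_lr(self, log_lr):
        self.lr = math.exp(log_lr)

    def describe(self):
        # Not decorated, so it is a normal method and not an instruction.
        return "NNet(lr=%g)" % self.lr


net = NNet()
net.set_log_lr(-2.0)  # direct call, as with any method

# The client lists the available edges and decides which one to cross.
instructions = net.instruction_set()
chosen = instructions["set_log_lr"]
chosen(chosen.instruction.arg_types[0].default)
```

Tagging the function rather than wrapping it keeps the direct-call path untouched, and the Learner only has to list its edges and cross them when asked.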