diff doc/v2_planning/learner.txt @ 1052:84f62533e7a8

v2planning learner - reply to comments
author James Bergstra <bergstrj@iro.umontreal.ca>
date Wed, 08 Sep 2010 15:43:32 -0400
parents f1732269bce8
children 390166ace9e5
--- a/doc/v2_planning/learner.txt	Wed Sep 08 15:39:51 2010 -0400
+++ b/doc/v2_planning/learner.txt	Wed Sep 08 15:43:32 2010 -0400
@@ -268,9 +268,9 @@
 whole saved model just to attach meta-info e.g. validation score.    Choosing this API spills
 over into other committees, so we should get their feedback about how to resolve it.
 
-Comment by OD
-~~~~~~~~~~~~~
-(I hope it's ok to leave comments even though I'm not in committee... I'm
+Comments
+~~~~~~~~
+OD asks: (I hope it's ok to leave comments even though I'm not in committee... I'm
 interested to see how the learner interface is shaping up so I'll be keeping
 an eye on this file)
 I'm wondering what's the benefit of such an API compared to simply defining a
@@ -284,14 +284,37 @@
 so why not directly call do_x / do_y instead?
 
 
-Comment by RP
-~~~~~~~~~~~~~
+JB replies: I agree with you, and in the implementation of a Learner I suggest
+using Python decorators to get the best of both worlds:
+    
+    class NNet(Learner):
+
+        ...
 
-James correct me if I'm wrong, but I think each instruction has a execute
+        @Instruction.new(arg_types=(Float(min=-8, max=-1, default=-4),))
+        def set_log_lr(self, log_lr):
+            self.lr.value = numpy.exp(log_lr)          
+  
+        ...
+
+The Learner base class can implement an instruction_set() that walks through the
+methods of 'self' and picks out the ones that have corresponding instructions.
+But anyone can call the method normally.  The NNet class can also have methods
+that are not instructions.
+
+
+
+RP asks: James, correct me if I'm wrong, but I think each instruction has an execute
 command. The job of the learner is to traverse the graph and for each edge
 that it decides to cross to call the execute of that edge. Maybe James has 
 something else in mind, but this was my understanding.
 
+JB replies: Close, but let me clarify one point.  The job of a
+Learner is simply to implement the API of a Learner - to list what edges are
+available and to be able to cross them if asked. The code *using* the Learner
+(client) decides which edges to cross.  The client may also be a Learner, but
+maybe not.
+
 
 
 Just another view/spin on the same idea (Razvan)
@@ -400,4 +423,13 @@
  
 
 
+JB asks: There is definitely a strong theme of graphs in both suggestions,
+and moreover of graphs that have heavy-duty nodes and light-weight edges.  But I
+don't necessarily think that we're proposing the same thing.  One difference is
+that the graph I talked about would be infinite in most cases of interest, so
+it's not going to be representable by Theano's data structures (even with lazy
+if).  Another difference is that the graph I had in mind doesn't feel fractal -
+it would be very common for a graph edge to be atomic.  A proxy pattern, such as
+in a hyper-learner, would create a notion of being able to zoom in, but other
+than that, I'm not sure what you mean.