changeset 1227:d9f93923765f

answers
author boulanni <nicolas_boulanger@hotmail.com>
date Wed, 22 Sep 2010 18:03:18 -0400
parents 16919775479c
children 86d802226a97
files doc/v2_planning/architecture_NB.txt
diffstat 1 files changed, 31 insertions(+), 0 deletions(-) [+]
--- a/doc/v2_planning/architecture_NB.txt	Wed Sep 22 17:17:52 2010 -0400
+++ b/doc/v2_planning/architecture_NB.txt	Wed Sep 22 18:03:18 2010 -0400
@@ -102,6 +102,14 @@
   What interface should the learner expose in order for the hyper-parameter to
   be generic (work for many/most/all learners)
 
+NB: In the case of a K-fold hyper-learner, I would expect the user to
+  specify the hyper-parameters completely, and the hyper-learner could just
+  pass them along blindly to the sub-learner. For more complex hyper-learners,
+  like a hyper-optimizer or a hyper-grid search, we would require supported
+  sub-learners to define a function "get_hyperparam" that returns a
+  dict of the form {name1: [default, range], name2: ...}. These
+  hyper-parameters are then supplied to the sub-learner's constructor.
+
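A minimal sketch of the "get_hyperparam" contract proposed above, assuming a {name: [default, range]} dict; the class names, hyper-parameter names, and values here are all hypothetical illustrations, not part of any existing library:

```python
import itertools

class DBNLearner:
    @staticmethod
    def get_hyperparam():
        # {name: [default, range]} as suggested in the note above
        return {
            "n_layers": [2, [1, 2, 3]],
            "learning_rate": [0.01, [0.001, 0.01, 0.1]],
        }

class HyperGrid:
    """A hyper-grid learner that builds its search space from the
    sub-learner's advertised hyper-parameters."""
    def __init__(self, sub_learner_cls):
        self.space = sub_learner_cls.get_hyperparam()

    def candidates(self):
        # Cartesian product over each hyper-parameter's declared range.
        names = list(self.space)
        for values in itertools.product(*(self.space[n][1] for n in names)):
            yield dict(zip(names, values))

grid = HyperGrid(DBNLearner)
configs = list(grid.candidates())  # 3 * 3 = 9 candidate settings
```

Each candidate dict could then be passed to the sub-learner's constructor as the note describes.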
 This K-fold learner, since it is generic, would work by launching multiple
 experiments and would support doing so in parallel inside of a job (python MPI
 ?) or by launching on the cluster multiple owned scripts that write results on
@@ -113,11 +121,24 @@
   support this so that we can do job control from DIRO without messing around
   with colosse, mammouth, condor, angel, etc. all separately.
 
+NB: The hyper-learner would have to support launching jobs on remote servers
+  via ssh. Common functionality for this could of course be reused between
+  different hyper-learners.
+
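The remote-launching functionality could be as thin as a wrapper around the ssh client; a rough sketch under that assumption (the function name and host handling are hypothetical), with a local mode so the same entry point works on the current machine:

```python
import shlex
import subprocess

def launch(command, host=None):
    # With host=None, run the command locally; otherwise wrap it in ssh.
    # A real hyper-learner would launch these in parallel and poll for
    # results rather than blocking on each one.
    argv = ["ssh", host, command] if host else shlex.split(command)
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout

# Locally this behaves like a plain shell call:
out = launch("echo fold-0 done")
```

The shared piece reused across hyper-learners would then be this launcher plus whatever convention is chosen for copying result files back.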
 JB asks:
   The format used to communicate results from 'learner' jobs with the kfold loop
   and with the stats collectors, and the experiment visualization code is not
   obvious - any ideas how to handle this?
 
+NB: The DBN is responsible for saving/viewing results inside a DBN experiment.
+  The hyper-learner controls the DBN's execution (even in a script on a remote
+  machine) and collects evaluation measurements after its dbn.predict call.
+  For K-fold it would typically just save the distribution and average of the
+  fold evaluations, in whatever internal convention can be transferred over
+  ssh. The K-fold hyper-learner would only expose its train interface (no
+  adapt or predict), since it cannot always be decomposed into many steps;
+  that depends on the sub-learner.
+
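The K-fold convention described above can be sketched as follows; every name here is hypothetical, and JSON stands in for "whatever internal convention can be transferred over ssh". Note that only train is exposed, per the note:

```python
import json
import statistics

class KFoldLearner:
    """Hypothetical K-fold hyper-learner: trains one sub-learner per fold,
    collects the evaluations, and exposes only `train`."""
    def __init__(self, make_learner):
        self.make_learner = make_learner
        self.scores = []

    def train(self, folds):
        # `folds` is a list of (train_set, valid_set) pairs.
        for train_set, valid_set in folds:
            learner = self.make_learner()
            learner.train(train_set)
            # Collect the evaluation after the sub-learner's predict/evaluate.
            self.scores.append(learner.evaluate(valid_set))
        return {"scores": self.scores, "mean": statistics.mean(self.scores)}

    def save_results(self, path):
        # A small serializable summary that can be copied back over ssh.
        with open(path, "w") as f:
            json.dump({"scores": self.scores,
                       "mean": statistics.mean(self.scores)}, f)
```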
 The library would also have a DBN learner with flexible hyper-parameters that
 control its detailed architecture. 
 
@@ -125,6 +146,11 @@
   What kind of building blocks should make this possible - how much flexibility
   and what kinds are permitted?
 
+NB: Things like the number of layers, the number of hidden units, and any
+  optional parameters that affect initialization or training (e.g. the AE or
+  RBM variant) that the DBN developer can think of. The final user would have
+  to specify those hyper-parameters to the K-fold learner anyway.
+
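Put together with the {name: [default, range]} form suggested earlier, the DBN's exposed hyper-parameters might look like this; the names, defaults, and ranges are purely illustrative:

```python
# Hypothetical hyper-parameter set a DBN developer might expose,
# in the {name: [default, range]} form discussed above.
dbn_hyperparams = {
    "n_layers": [3, [1, 2, 3, 4]],
    "n_hidden": [500, [100, 500, 1000]],
    "layer_type": ["rbm", ["rbm", "ae"]],   # RBM or auto-encoder variant
    "pretrain_epochs": [15, [5, 15, 30]],
    "learning_rate": [0.01, [0.001, 0.01, 0.1]],
}
```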
 The interface of the provided dataset would have to conform to possible inputs
 that the DBN module understands, i.e. by
 default 2D numpy arrays. If more complex dataset needs arise, either subclass a
@@ -139,4 +165,9 @@
   stop and start (as in long-running jobs) nor control via a hyper-parameter
  optimizer.  So I don't think code in the style of the current tutorials is very
   useful in the library.
+  
+NB: I could see how we could require all learners to define stop and restart
+  methods, so they would be responsible for saving and restoring themselves.
+  A hyper-learner's stop and restart methods would in addition recursively
+  call its sub-learners' stop and restart methods.
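A rough sketch of that stop/restart contract, assuming each learner pickles its own state and a hyper-learner recurses into its sub-learners; the class names, method signatures, and file-naming scheme are all hypothetical:

```python
import pickle

class Learner:
    def __init__(self):
        self.state = {}

    def stop(self, path):
        # Each learner is responsible for serializing its own state.
        with open(path, "wb") as f:
            pickle.dump(self.state, f)

    def restart(self, path):
        with open(path, "rb") as f:
            self.state = pickle.load(f)

class HyperLearner(Learner):
    def __init__(self, sub_learners):
        super().__init__()
        self.sub_learners = sub_learners

    def stop(self, path):
        super().stop(path)
        # Recursively checkpoint every sub-learner as well.
        for i, sub in enumerate(self.sub_learners):
            sub.stop(f"{path}.sub{i}")

    def restart(self, path):
        super().restart(path)
        for i, sub in enumerate(self.sub_learners):
            sub.restart(f"{path}.sub{i}")
```

With this shape, stopping a long-running hyper-learner job checkpoints the whole learner tree, and restart rebuilds it from the same files.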