pylearn (Mercurial repository)
changeset 1227:d9f93923765f
answers
author: boulanni <nicolas_boulanger@hotmail.com>
date: Wed, 22 Sep 2010 18:03:18 -0400
parents: 16919775479c
children: 86d802226a97
files: doc/v2_planning/architecture_NB.txt
diffstat: 1 files changed, 31 insertions(+), 0 deletions(-)
--- a/doc/v2_planning/architecture_NB.txt	Wed Sep 22 17:17:52 2010 -0400
+++ b/doc/v2_planning/architecture_NB.txt	Wed Sep 22 18:03:18 2010 -0400
@@ -102,6 +102,14 @@
 What interface should the learner expose in order for the hyper-parameter
 to be generic (work for many/most/all learners)

+NB: In the case of a K-fold hyper-learner, I would expect the user to
+    completely specify the hyper-parameters, and the hyper-learner could just
+    blindly pass them along to the sub-learner. For more complex
+    hyper-learners, like a hyper-optimizer or hyper-grid, we would require
+    supported sub-learners to define a function "get_hyperparam" that returns
+    a dict(name1: [default, range], name2: ...). These hyper-parameters are
+    supplied to the learner constructor.
+
 This K-fold learner, since it is generic, would work by launching multiple
 experiments and would support doing so in parallel inside of a job (Python
 MPI?) or by launching on the cluster multiple owned scripts that write
 results on
@@ -113,11 +121,24 @@
 support this so that we can do job control from DIRO without messing around
 with colosse, mammouth, condor, angel, etc. all separately.

+NB: The hyper-learner would have to support launching jobs on remote servers
+    via ssh. Common functionality for this could of course be reused between
+    different hyper-learners.
+
 JB asks: The format used to communicate results from 'learner' jobs with
 the kfold loop and with the stats collectors, and the experiment
 visualization code is not obvious - any ideas how to handle this?

+NB: The DBN is responsible for saving/viewing results inside a DBN
+    experiment. The hyper-learner controls DBN execution (even in a script
+    on a remote machine) and collects evaluation measurements after its
+    dbn.predict call. For K-fold, it would typically just save the
+    evaluation distribution and average in whatever way (internal
+    convention) that can be transferred over ssh.
+    The K-fold hyper-learner would only expose its train interface (no
+    adapt, predict), since it cannot always be decomposed into many steps,
+    depending on the sub-learner.
+
 The library would also have a DBN learner with flexible hyper-parameters
 that control its detailed architecture.
@@ -125,6 +146,11 @@
 What kind of building blocks should make this possible - how much
 flexibility and what kinds are permitted?

+NB: Things like the number of layers, hidden units, and any optional
+    parameters that affect initialization or training (i.e. AE or RBM
+    variant) that the DBN developer can think of. The final user would have
+    to specify those hyper-parameters to the K-fold learner anyway.
+
 The interface of the provided dataset would have to conform to possible
 inputs that the DBN module understands, i.e. by default 2D numpy arrays.
 If more complex dataset needs arise, either subclass a
@@ -139,4 +165,9 @@
 stop and start (as in long-running jobs) nor control via a hyper-parameter
 optimizer. So I don't think code in the style of the current tutorials is
 very useful in the library.
+
+NB: I could see how we could require all learners to define stop and
+    restart methods, so they would be responsible for saving and restoring
+    themselves. A hyper-learner's stop and restart methods would in
+    addition recursively call its sub-learners' stop and restart methods.
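The "get_hyperparam" convention proposed in the first hunk could be sketched as below. Everything here is an illustrative assumption, not pylearn API: the `DBNLearner` class, the specific hyper-parameter names, and the `hyper_grid` driver are made up to show how a hyper-learner would consume the dict(name: [default, range]) spec.

```python
# Sketch of the proposed "get_hyperparam" convention (all names are
# illustrative assumptions, not actual pylearn API).
from itertools import product


class DBNLearner(object):
    """Hypothetical sub-learner advertising its tunable hyper-parameters."""

    @staticmethod
    def get_hyperparam():
        # name -> [default, range], as proposed in the changeset
        return {
            "n_layers": [3, range(1, 6)],
            "n_hidden": [256, [64, 128, 256, 512]],
        }

    def __init__(self, **hyperparams):
        # hyper-parameters are supplied to the learner constructor
        self.hyperparams = hyperparams


def hyper_grid(learner_cls):
    """Hypothetical hyper-grid driver: instantiate the sub-learner once
    per point of the full hyper-parameter grid."""
    spec = learner_cls.get_hyperparam()
    names = sorted(spec)
    for values in product(*(spec[name][1] for name in names)):
        yield learner_cls(**dict(zip(names, values)))


learners = list(hyper_grid(DBNLearner))
# 5 values of n_layers x 4 values of n_hidden -> 20 candidate learners
print(len(learners))  # 20
```

A hyper-optimizer would consume the same spec but sample the ranges instead of enumerating them exhaustively.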
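The train-only K-fold hyper-learner described in the diff might look like the following sketch. The class and method names (`train`, `evaluate`, `split_fold`) and the trivial `MeanLearner` used to exercise it are assumptions for illustration, not an actual design.

```python
# Sketch of a K-fold hyper-learner that only exposes train() (no adapt,
# no predict), as proposed in the changeset. All names are illustrative.


def split_fold(dataset, fold, k):
    """Hypothetical helper: example i goes to the validation fold
    when i % k == fold."""
    valid = [x for i, x in enumerate(dataset) if i % k == fold]
    train = [x for i, x in enumerate(dataset) if i % k != fold]
    return train, valid


class KFoldHyperLearner(object):
    def __init__(self, sub_learner_factory, k=5):
        self.sub_learner_factory = sub_learner_factory
        self.k = k

    def train(self, dataset, hyperparams):
        """Run one sub-learner per fold; keep the score distribution
        and its average (the 'internal convention' of the diff)."""
        scores = []
        for fold in range(self.k):
            train_set, valid_set = split_fold(dataset, fold, self.k)
            learner = self.sub_learner_factory(**hyperparams)
            learner.train(train_set)
            scores.append(learner.evaluate(valid_set))
        self.scores = scores
        self.mean_score = sum(scores) / float(len(scores))
        return self.mean_score


class MeanLearner(object):
    """Trivial sub-learner used only to exercise the sketch."""

    def __init__(self, **hyperparams):
        self.mean = 0.0

    def train(self, train_set):
        self.mean = sum(train_set) / float(len(train_set))

    def evaluate(self, valid_set):
        # mean absolute error on the held-out fold
        return sum(abs(x - self.mean) for x in valid_set) / len(valid_set)


kfold = KFoldHyperLearner(MeanLearner, k=5)
print(kfold.train(list(range(10)), {}))  # 2.5
```

In the real setting each fold's sub-learner would run as a separate job (MPI rank or cluster script) rather than in the loop shown here.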
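Launching owned scripts on remote servers over ssh, as the second NB suggests, could be factored into a shared helper along these lines. The host name, script name, and arguments are placeholders; a real implementation would hand the argv list to subprocess.Popen and later collect the results file the script writes.

```python
# Sketch of shared ssh job-launching functionality (hypothetical helper;
# host, script, and flags below are placeholders, not real entry points).
import subprocess


def remote_command(host, script, args):
    """Build the argv list for running an experiment script over ssh."""
    return ["ssh", host, "python", script] + list(args)


def launch_remote(host, script, args):
    """Actually start the job; the hyper-learner would keep the Popen
    handle for job control."""
    return subprocess.Popen(remote_command(host, script, args))


cmd = remote_command("mammouth", "run_dbn.py", ["--n-layers", "3"])
print(" ".join(cmd))  # ssh mammouth python run_dbn.py --n-layers 3
```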
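The recursive stop/restart requirement in the last hunk could be sketched as follows. The method names and the snapshot-dict convention are assumptions; a real implementation would serialize to disk so a long-running job can resume after the process dies.

```python
# Sketch of the recursive stop/restart convention from the last hunk.
# Method names and the snapshot format are illustrative assumptions.


class Learner(object):
    """Base class: every learner saves and restores its own state.
    A shallow copy of __dict__ is enough for this sketch."""

    def stop(self):
        return {"state": self.__dict__.copy()}

    def restart(self, snapshot):
        self.__dict__.update(snapshot["state"])


class HyperLearner(Learner):
    """A hyper-learner also stops/restarts its sub-learners recursively."""

    def __init__(self, sub_learners):
        self.sub_learners = sub_learners

    def stop(self):
        snapshot = Learner.stop(self)
        snapshot["subs"] = [s.stop() for s in self.sub_learners]
        return snapshot

    def restart(self, snapshot):
        Learner.restart(self, snapshot)
        for sub, sub_snap in zip(self.sub_learners, snapshot["subs"]):
            sub.restart(sub_snap)


sub = Learner()
sub.epoch = 3
hyper = HyperLearner([sub])
snapshot = hyper.stop()   # recursively snapshots the sub-learner too
sub.epoch = 99            # simulate progress after the snapshot
hyper.restart(snapshot)
print(sub.epoch)  # 3
```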