# HG changeset patch
# User boulanni
# Date 1285192998 14400
# Node ID d9f93923765fd52f5ff468661ed708897d28327a
# Parent 16919775479c5f9c09f4587a226237d2c505f949
answers

diff -r 16919775479c -r d9f93923765f doc/v2_planning/architecture_NB.txt
--- a/doc/v2_planning/architecture_NB.txt	Wed Sep 22 17:17:52 2010 -0400
+++ b/doc/v2_planning/architecture_NB.txt	Wed Sep 22 18:03:18 2010 -0400
@@ -102,6 +102,14 @@
 What interface should the learner expose in order for the hyper-parameter to
 be generic (work for many/most/all learners)
 
+NB: In the case of a K-fold hyper-learner, I would expect the user to
+  completely specify the hyper-parameters, and the hyper-learner could just
+  blindly pass them along to the sub-learner. For more complex hyper-learners
+  like hyper-optimizer or hyper-grid, we would require supported sub-learners
+  to define a function "get_hyperparam" that returns a
+  dict(name1: [default, range], name2: ...). These hyper-parameters are
+  supplied to the learner constructor.
+
 This K-fold learner, since it is generic, would work by launching multiple
 experiments and would support doing so in parallel inside of a job (python MPI ?)
 or by launching on the cluster multiple owned scripts that write results on
@@ -113,11 +121,24 @@
 support this so that we can do job control from DIRO without messing around
 with colosse, mammouth, condor, angel, etc. all separately.
 
+NB: The hyper-learner would have to support launching jobs on remote servers
+  via ssh. Common functionality for this could of course be reused between
+  different hyper-learners.
+
 JB asks: The format used to communicate results from 'learner' jobs with the
 kfold loop and with the stats collectors, and the experiment visualization
 code is not obvious - any ideas how to handle this?
 
+NB: The DBN is responsible for saving/viewing results inside a DBN experiment.
+  The hyper-learner controls DBN execution (even in a script on a remote
+  machine) and collects evaluation measurements after its dbn.predict call.
+  For K-fold it would typically just save the evaluation distribution and
+  average in whatever way (internal convention) can be transferred over ssh.
+  The K-fold hyper-learner would only expose its train interface (no adapt,
+  predict) since, depending on the sub-learner, it cannot always be decomposed
+  into multiple steps.
+
 The library would also have a DBN learner with flexible hyper-parameters
 that control its detailed architecture.
 
@@ -125,6 +146,11 @@
 What kind of building blocks should make this possible - how much flexibility
 and what kinds are permitted?
 
+NB: Things like the number of layers, the number of hidden units, and any
+  optional parameters that affect initialization or training (e.g. the AE or
+  RBM variant) that the DBN developer can think of. The end user would have
+  to specify those hyper-parameters to the K-fold learner anyway.
+
 The interface of the provided dataset would have to conform to possible
 inputs that the DBN module understands, i.e. by default 2D numpy arrays.
 If more complex dataset needs arise, either subclass a
@@ -139,4 +165,9 @@
 stop and start (as in long-running jobs) nor control via a hyper-parameter
 optimizer. So I don't think code in the style of the curren tutorials is very
 useful in the library.
 
+
+NB: I could see how we could require all learners to define stop and restart
+  methods, so that they would be responsible for saving and restoring
+  themselves. A hyper-learner's stop and restart methods would in addition
+  recursively call its sub-learners' stop and restart methods.
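To make the proposed "get_hyperparam" convention concrete, here is a minimal
sketch (outside the patch itself, not code from the repository): a sub-learner
advertises its hyper-parameters as dict(name: [default, range]), and a grid
hyper-learner blindly enumerates the ranges. All class, method, and
hyper-parameter names here are illustrative assumptions.

```python
# Illustrative sketch of the get_hyperparam convention discussed in the
# answers above; the class names (DBNLearner, HyperGrid) and the specific
# hyper-parameters are hypothetical.
import itertools


class DBNLearner:
    """Hypothetical sub-learner that exposes its tunable hyper-parameters."""

    def __init__(self, n_layers=2, n_hidden=100):
        self.n_layers = n_layers
        self.n_hidden = n_hidden

    @staticmethod
    def get_hyperparam():
        # name -> [default, range of admissible values]
        return {
            "n_layers": [2, [1, 2, 3]],
            "n_hidden": [100, [50, 100, 200]],
        }


class HyperGrid:
    """Hypothetical hyper-grid learner: enumerates every combination of the
    hyper-parameter ranges advertised by the sub-learner class."""

    def __init__(self, learner_cls):
        self.learner_cls = learner_cls

    def configurations(self):
        spec = self.learner_cls.get_hyperparam()
        names = sorted(spec)
        for values in itertools.product(*(spec[n][1] for n in names)):
            # each configuration could be passed to the sub-learner constructor
            yield dict(zip(names, values))


grid = HyperGrid(DBNLearner)
configs = list(grid.configurations())
# 3 values for n_layers x 3 values for n_hidden -> 9 configurations
assert len(configs) == 9
```

Each yielded dict can be passed straight to the sub-learner constructor
(e.g. DBNLearner(**configs[0])), which matches the answer's point that the
hyper-parameters are "supplied to the learner constructor".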