doc/v2_planning/architecture_NB.txt @ 1474:a57f4839a9d8
author: James Bergstra <bergstrj@iro.umontreal.ca>
date: Wed, 18 May 2011 10:52:42 -0400

Here is how I think the Pylearn library could be organized, simply and
efficiently.

We said the main goals for a library are:
1. Easily connect new learners with new datasets
2. Easily build new formula-based learners
3. Have "hyper" learning facilities such as hyper-parameter optimization,
model selection, experiment design, etc.

We should focus on those features. They are 80% of our use cases, and the
other 20% will always be new developments, which cannot be predicted.
Focusing on the 80% is relatively simple, and implementation could be done in
a matter of weeks.

Let's say we have a DBN learner and we want to plan ahead for possible
modifications and decompose it into small "usable" chunks. When a new student
wants to modify the learning procedure, we envisioned either:

1. A pre-made hyper-learning graph of a DBN that he can "conveniently" adapt
to his needs

2. A hook or message system that allows custom actions at various set points
in the file (pre-defined, but more can also be "easily" added)

However, consider that it is CODE that he wants to modify. The intricate
details of new learning algorithms may involve modifying ANY part of the
code: adding loops, changing algorithms, etc. There are two time-tested
methods for dealing with this:

1. Change the code. Add a new parameter that optionally does the job. OR, if
the changes are substantial:

2. Copy the DBN code, modify it and save your forked version. Each learner
or significantly new experiment should have its own file. We should not try
to generalize what is not generalizable. In other words, small loops and
mini-algorithms inside learners may not be worth encapsulating.

Based on the above three main goals, two objects need well-defined
encapsulation: datasets and learners.
(Visualization should be included in the learners. The hard part is not the
print or pylab.plot statements, it's the statistics gathering.)
Here is the basic interface we talked about, and how we would work out some
special cases.

Datasets: fetch mini-batches as numpy arrays in the usual format.
Learners: a "standalone" interface: a train function that includes optional
visualization; an "advanced" interface for more control: adapt and predict
functions.

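To make this interface concrete, here is a minimal Python sketch. The class
and method names (Dataset, Learner, minibatches, ArrayDataset) are
illustrative placeholders, not an agreed-upon API; ArrayDataset also covers
the "dataset that sits in RAM" case mentioned below:

```python
import numpy as np

class Dataset:
    """Minimal dataset interface: yields mini-batches as numpy arrays
    in the usual (n_examples, n_features) format."""
    def minibatches(self, batch_size):
        raise NotImplementedError

class Learner:
    """Minimal learner interface."""
    def train(self, dataset):
        # "standalone": full training loop, optional visualization inside
        raise NotImplementedError
    def adapt(self, X, y=None):
        # "advanced": one training update
        raise NotImplementedError
    def predict(self, X):
        # "advanced": inference only
        raise NotImplementedError

class ArrayDataset(Dataset):
    """In-RAM dataset backed by numpy arrays."""
    def __init__(self, X, y):
        self.X, self.y = X, y
    def minibatches(self, batch_size):
        # slice the stored arrays into consecutive mini-batches
        for i in range(0, len(self.X), batch_size):
            yield self.X[i:i + batch_size], self.y[i:i + batch_size]
```

The point of keeping the dataset side this thin is that any learner only ever
sees numpy arrays, so new dataset classes need no learner-side changes.
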
- K-fold cross-validation? Write a generic "hyper"-learner that does this for
  arbitrary learners via their "advanced" interface. ... and what if multiple
  similar datasets can be learned more efficiently by a particular learner?
  Include an option inside the learner to cross-validate.
- Optimizers? Have a generic "Theano formula"-based learner for each
  optimizer you want (SGD, momentum, delta-bar-delta, etc.). Of course,
  combine similar optimizers with compatible parameters. A set of helper
  functions should also be provided for building the actual Theano formula.
- Early stopping? This has to be included inside the train function of each
  learner where applicable (probably only the formula-based generic ones
  anyway).
- A generic hyper-parameter optimizer? Write a generic hyper-learner that
  does this, and a simple "grid" one. Require supported learners to provide
  the list/distribution of their applicable hyper-parameters, which will be
  supplied to their constructor at the hyper-learner's discretion.
- Visualization? Each learner defines what can be visualized and how.
- Early stopping curves? The early stopping learner optionally shows this.
- Complex 2D-subset curves over hyper-parameters? Add this as an option in
  the hyper-parameter optimizer.
- Want a dataset that sits in RAM? Write a custom class that still outputs
  numpy arrays in the usual format.
- Want an infinite auto-generated dataset? Write a custom class that
  generates and outputs numpy arrays on the fly.
- Dealing with time series with multi-dimensional input? This requires
  cooperation between learner and dataset. Use 3-dimensional numpy arrays.
  Write a dataset that outputs these and a learner that understands them, OR
  write a dataset that converts to one-dimensional input and use any learner.
- A sophisticated performance evaluation function? It should be possible to
  supply such an evaluation function to every learner.
- Have a multi-step complex learning procedure using gradient-based learning
  in some steps? Write a "hyper"-learner that successively calls
  formula-based learners and directly accesses their weights member variables
  to initialize subsequent learners.
- Want to combine early stopping curves for many hyper-parameter values?
  Modify the optimization-based learners to save the early stopping curve as
  a member variable, and use this in the hyper-parameter learner's
  visualization routine.
- Curriculum learning? This requires cooperation between learner and dataset.
  Require supported datasets to understand a function call "set_experience",
  or anything you decide.
- Filter visualization for the selected best hyper-parameter set? Include
  code in the formula-based learners to look up the weights applied to the
  input, and activate visualization in the hyper-learner only for the chosen
  hyper-parameters.
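
The "infinite auto-generated dataset" case above can be sketched as follows.
`GeneratedDataset` and its `generate_fn` argument are hypothetical names; the
only real requirement is that the class emits numpy arrays like any other
dataset:

```python
import numpy as np

class GeneratedDataset:
    """Infinite dataset: draws fresh examples on the fly instead of
    storing them. `generate_fn(rng, batch_size)` is a user-supplied
    function returning an (X, y) pair of numpy arrays."""
    def __init__(self, generate_fn, rng_seed=0):
        self.generate_fn = generate_fn
        self.rng = np.random.RandomState(rng_seed)
    def minibatches(self, batch_size):
        while True:  # never exhausted; the consumer decides when to stop
            yield self.generate_fn(self.rng, batch_size)
```

A learner consuming this must bound its own loop (e.g. a fixed number of
updates), since the iterator never raises StopIteration.
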


>> to demonstrate architecture designs on kfold dbn training - how would you
>> propose that the library help to do that?

By providing a generic K-fold cross-validation "hyper"-learner that controls
an arbitrary learner via its advanced interface (train, adapt) and its
exposed hyper-parameters, which it would fix on behalf of the user.

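As a sketch under the interface assumptions above: `KFold` and the
`make_learner` factory are invented names, and for brevity the sub-learner
here trains directly on arrays rather than on a dataset object:

```python
import numpy as np

class KFold:
    """Generic K-fold "hyper"-learner. `make_learner` is a factory taking
    fully-specified hyper-parameters, so KFold stays agnostic about the
    sub-learner's type."""
    def __init__(self, make_learner, hyperparams, k=5):
        self.make_learner = make_learner
        self.hyperparams = hyperparams
        self.k = k
    def train(self, X, y):
        folds = np.array_split(np.arange(len(X)), self.k)
        scores = []
        for i in range(self.k):
            valid = folds[i]
            train = np.concatenate(
                [folds[j] for j in range(self.k) if j != i])
            # fresh sub-learner per fold, hyper-parameters fixed by KFold
            learner = self.make_learner(**self.hyperparams)
            learner.train(X[train], y[train])   # standalone interface
            pred = learner.predict(X[valid])    # advanced interface
            scores.append(float(np.mean(pred == y[valid])))
        return scores  # the evaluation distribution; average at will
```
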
JB asks:
What interface should the learner expose in order for the hyper-learner to
be generic (i.e. to work for many/most/all learners)?

NB: In the case of a K-fold hyper-learner, I would expect the user to
completely specify the hyper-parameters, and the hyper-learner could just
blindly pass them along to the sub-learner. For more complex hyper-learners
like a hyper-optimizer or hyper-grid, we would require supported sub-learners
to define a function "get_hyperparam" that returns a
dict(name1: [default, range], name2: ...). These hyper-parameters are
supplied to the learner constructor.

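A sketch of that convention. `DBNLearner` here is a stand-in with made-up
hyper-parameters and ranges, and `grid_configurations` shows how a simple
"grid" hyper-learner might consume `get_hyperparam`; none of these names are
decided:

```python
import itertools

class DBNLearner:
    """Hypothetical sub-learner advertising its hyper-parameters in the
    dict(name: [default, range]) convention described above."""
    def __init__(self, n_layers=2, n_hidden=100):
        self.n_layers, self.n_hidden = n_layers, n_hidden
    @staticmethod
    def get_hyperparam():
        return {'n_layers': [2, [1, 2, 3]],
                'n_hidden': [100, [50, 100, 200]]}

def grid_configurations(learner_cls):
    """A simple "grid" hyper-learner: instantiate the learner once per
    point of the cross-product of the advertised ranges."""
    spec = learner_cls.get_hyperparam()
    names = sorted(spec)
    for values in itertools.product(*(spec[n][1] for n in names)):
        yield learner_cls(**dict(zip(names, values)))
```

A hyper-optimizer would use the same `get_hyperparam` dict but sample or
search the ranges instead of enumerating them.
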
This K-fold learner, since it is generic, would work by launching multiple
experiments, and would support doing so either in parallel inside one job
(Python MPI?) or by launching multiple owned scripts on the cluster that
write results to disk in the way specified by the K-fold learner.

JB asks:
This is not technically possible if the worker nodes and the master node do
not all share a filesystem. There is a soft requirement that the library
support this so that we can do job control from DIRO without messing around
with colosse, mammouth, condor, angel, etc. all separately.

NB: The hyper-learner would have to support launching jobs on remote servers
via ssh. Common functionality for this could of course be reused between
different hyper-learners.

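A minimal sketch of that shared piece: this only builds the command line a
hyper-learner might hand to `subprocess.check_call`; the host and script
names are invented, and a real version would also need to copy results back:

```python
def ssh_launch_command(host, script, args):
    """Build (but do not run) the ssh command used to launch one owned
    script on a remote server."""
    remote = ' '.join(['python', script] + [str(a) for a in args])
    return ['ssh', host, remote]
```
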
JB asks:
The format used to communicate results from the 'learner' jobs to the k-fold
loop, the stats collectors, and the experiment visualization code is not
obvious - any ideas how to handle this?

NB: The DBN is responsible for saving/viewing results inside a DBN
experiment. The hyper-learner controls DBN execution (even in a script on a
remote machine) and collects evaluation measurements after its dbn.predict
call. For K-fold it would typically just save the evaluation distribution
and average in whatever way (an internal convention) can be transferred over
ssh. The K-fold hyper-learner would only expose its train interface (no
adapt, predict), since training cannot always be decomposed into multiple
steps, depending on the sub-learner.

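One possible internal convention for those result files, sketched with JSON;
the file layout and function names are assumptions, not a decided design:

```python
import json

def save_fold_results(path, scores):
    """Each learner job writes its evaluation scores to a small JSON file
    that the K-fold hyper-learner can fetch over ssh and aggregate."""
    summary = {'scores': list(scores),
               'mean': sum(scores) / len(scores)}
    with open(path, 'w') as f:
        json.dump(summary, f)
    return summary

def load_fold_results(path):
    """Read one job's summary back on the hyper-learner side."""
    with open(path) as f:
        return json.load(f)
```

Any self-describing text format would do equally well; the important part is
that the convention is owned by the hyper-learner, not by each sub-learner.
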
The library would also have a DBN learner with flexible hyper-parameters
that control its detailed architecture.

JB asks:
What kinds of building blocks should make this possible - how much
flexibility, and what kinds are permitted?

NB: Things like the number of layers, the number of hidden units, and any
optional parameters that affect initialization or training (e.g. AE or RBM
variant) that the DBN developer can think of. The final user would have to
specify those hyper-parameters to the K-fold learner anyway.

The interface of the provided dataset would have to conform to the possible
inputs that the DBN module understands, i.e. by default 2D numpy arrays. If
more complex dataset needs arise, either subclass a converter for the known
format or add this functionality to the DBN learner directly. The details of
the DBN learner core would resemble the tutorials, would typically be
contained in one straightforward code file, and could potentially use
"Theano-formula"-based learners as intermediate steps.

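The "subclass a converter" option could look like this; `FlattenTimeSeries`
is an invented name for a wrapper that feeds 3-D time-series batches to a
learner that only understands the default 2-D format:

```python
import numpy as np

class FlattenTimeSeries:
    """Converter dataset: wraps a dataset yielding 3-D
    (batch, time, features) arrays and flattens each batch to 2-D."""
    def __init__(self, inner):
        self.inner = inner
    def minibatches(self, batch_size):
        for X, y in self.inner.minibatches(batch_size):
            # collapse the (time, features) axes into one input vector
            yield X.reshape(len(X), -1), y
```
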
JB asks:

One of the troubles with straightforward code is that it is neither easy to
stop and restart (as in long-running jobs) nor to control via a
hyper-parameter optimizer. So I don't think code in the style of the current
tutorials is very useful in the library.

NB: I could see how we could require all learners to define stop and restart
methods, so that they would be responsible for saving and restoring
themselves. A hyper-learner's stop and restart methods would in addition
recursively call its sub-learners' stop and restart methods.
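
A sketch of such a stop/restart protocol; the pickle-based state saving, the
checkpoint-path scheme and the attribute names are all illustrative only:

```python
import pickle

class StoppableLearner:
    """Each learner saves/restores its own state; a hyper-learner
    recurses into its sub-learners, as proposed above."""
    def __init__(self):
        self.sublearners = []
        self.state = {}
    def stop(self, path):
        # save own state, then recurse into sub-learners
        with open(path, 'wb') as f:
            pickle.dump(self.state, f)
        for i, sub in enumerate(self.sublearners):
            sub.stop('%s.sub%d' % (path, i))
    def restart(self, path):
        # restore own state, then recurse into sub-learners
        with open(path, 'rb') as f:
            self.state = pickle.load(f)
        for i, sub in enumerate(self.sublearners):
            sub.restart('%s.sub%d' % (path, i))
```

The recursion means a K-fold or grid hyper-learner gets checkpointing for
free once its sub-learners implement the two methods.
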