Mercurial > pylearn
doc/v2_planning/architecture_NB.txt @ 1474:a57f4839a9d8 (merge)
author: James Bergstra <bergstrj@iro.umontreal.ca>
date:   Wed, 18 May 2011 10:52:42 -0400

Here is how I think the Pylearn library could be organized simply and
efficiently.

We said the main goals for a library are:
1. Easily connect new learners with new datasets
2. Easily build new formula-based learners
3. Have "hyper"-learning facilities such as hyper-parameter optimization,
model selection, experiment design, etc.

We should focus on those features. They are 80% of our use cases, and the other
20% will always be new developments, which are inherently unpredictable.
Focusing on the 80% is relatively simple, and implementation could be done in a
matter of weeks.

Let's say we have a DBN learner and we want to plan ahead for possible
modifications by decomposing it into small "usable" chunks. When a new student
wants to modify the learning procedure, we envisioned either:

1. A pre-made hyper-learning graph of a DBN that he can "conveniently" adapt to
his needs

2. A hook or message system that allows custom actions at various set points
in the file (pre-defined, but new ones can also be "easily" added)

However, consider that it is CODE that he wants to modify. The intricate
details of new learning algorithms may involve modifying ANY part of the code:
adding loops, changing algorithms, etc. There are two well time-tested methods
for dealing with this:

1. Change the code. Add a new parameter that optionally does the job. OR, if
changes are substantial:

2. Copy the DBN code, modify it, and save your forked version. Each learner or
significantly new experiment should have its own file. We should not try to
generalize what is not generalizable. In other words, small loops and
mini-algorithms inside learners may not be worth encapsulating.

Based on the above three main goals, two objects need well-defined
encapsulation: datasets and learners.
(Visualization should be included in the learners. The hard part is not the
print or pylab.plot statements, it's the statistics gathering.)
Here is the basic interface we talked about, and how we would work out some
special cases.

Datasets: fetch mini-batches as numpy arrays in the usual format.
Learners: a "standalone" interface: a train function that includes optional
visualization; an "advanced" interface for more control: adapt and predict
functions.

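A minimal Python sketch of what these two interfaces could look like. All
names here (Dataset, Learner, minibatches, the batch size) are illustrative
assumptions for the sketch, not a fixed API; plain lists stand in for the
numpy arrays the text assumes.

```python
# Illustrative sketch of the dataset/learner split discussed above.
# Class and method names are assumptions, not the actual Pylearn API.

class Dataset(object):
    """Serves mini-batches in the usual format (rows = examples)."""
    def minibatches(self, batch_size):
        raise NotImplementedError

class Learner(object):
    """Standalone interface: train(); advanced interface: adapt()/predict()."""
    def train(self, dataset, visualize=False):
        # Default standalone behaviour: repeatedly adapt on mini-batches.
        for batch in dataset.minibatches(batch_size=32):
            self.adapt(batch)
    def adapt(self, minibatch):
        raise NotImplementedError
    def predict(self, inputs):
        raise NotImplementedError

# A trivial dataset that sits in RAM: a custom class that still yields
# data in the usual format.
class InMemoryDataset(Dataset):
    def __init__(self, rows):
        self.rows = rows
    def minibatches(self, batch_size):
        for i in range(0, len(self.rows), batch_size):
            yield self.rows[i:i + batch_size]
```

The same `minibatches` protocol would also cover the infinite auto-generated
dataset case below: the generator simply never stops yielding.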
- K-fold cross-validation? Write a generic "hyper"-learner that does this for
  arbitrary learners via their "advanced" interface. ... and if multiple
  similar datasets can be learned more efficiently for a particular learner?
  Include an option inside the learner to cross-validate.
- Optimizers? Have a generic "Theano formula"-based learner for each optimizer
  you want (SGD, momentum, delta-bar-delta, etc.). Of course, combine similar
  optimizers with compatible parameters. A set of helper functions should also
  be provided for building the actual Theano formula.
- Early stopping? This has to be included inside the train function for each
  learner where applicable (probably only the formula-based generic ones
  anyway).
- A generic hyper-parameter optimizer? Write a generic hyper-learner that does
  this, and a simple "grid" one. Require supported learners to provide the
  list/distribution of their applicable hyper-parameters, which will be
  supplied to their constructor at the hyper-learner's discretion.
- Visualization? Each learner defines what can be visualized and how.
- Early stopping curves? The early stopping learner optionally shows this.
- Complex hyper-parameter 2D-subset curves? Add this as an option in the
  hyper-parameter optimizer.
- Want a dataset that sits in RAM? Write a custom class that still outputs
  numpy arrays in the usual format.
- Want an infinite auto-generated dataset? Write a custom class that generates
  and outputs numpy arrays on the fly.
- Dealing with time series with multi-dimensional input? This requires
  cooperation between learner and dataset. Use 3-dimensional numpy arrays:
  write a dataset that outputs these and a learner that understands them. OR
  write a dataset that converts to one-dimensional input and use any learner.
- A sophisticated performance evaluation function? It should be possible to
  supply this evaluation function to every learner.
- Have a multi-step complex learning procedure using gradient-based learning
  in some steps? Write a "hyper"-learner that successively calls formula-based
  learners and directly accesses their weights member variables to initialize
  subsequent learners.
- Want to combine early stopping curves for many hyper-parameter values?
  Modify the optimization-based learners to save the early stopping curve as a
  member variable, and use this in the hyper-parameter learner's visualization
  routine.
- Curriculum learning? This requires cooperation between learner and dataset.
  Require supported datasets to understand a function call "set_experience",
  or anything you decide.
- Filter visualization for the selected best hyper-parameter set? Include code
  in the formula-based learners to look for the weights applied to the input,
  and activate visualization in the hyper-learner only for the chosen
  hyper-parameters.

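As an illustration of the first bullet above, a generic K-fold "hyper"-learner
might drive an arbitrary learner roughly like this. Everything in the sketch
is an assumption: the factory argument, the list-of-examples protocol, and the
evaluate callback are placeholders for whatever interface the library settles
on.

```python
# Sketch of a generic K-fold cross-validation "hyper"-learner.  It needs no
# knowledge of the wrapped learner beyond a factory (so each fold starts
# from a fresh, identically-configured instance) and the standalone train
# interface.  All names are illustrative assumptions.

def kfold_scores(make_learner, examples, k, evaluate):
    """Return one evaluation score per fold.

    make_learner: zero-argument factory returning a fresh learner
    examples:     list of training examples
    k:            number of folds
    evaluate:     function(learner, held_out_examples) -> score
    """
    folds = [examples[i::k] for i in range(k)]  # simple interleaved split
    scores = []
    for i in range(k):
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        learner = make_learner()
        learner.train(train)            # standalone interface of sub-learner
        scores.append(evaluate(learner, folds[i]))
    return scores
```

Because the hyper-learner only sees the factory and the evaluate callback, the
same loop works for any sub-learner, which is the point of the generic
"hyper"-learner idea.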
>> to demonstrate architecture designs on kfold dbn training - how would you
>> propose that the library help to do that?

By providing a K-fold cross-validation generic "hyper"-learner that controls an
arbitrary learner via its advanced interface (train, adapt) and its exposed
hyper-parameters, which would be fixed on behalf of the user.

JB asks:
What interface should the learner expose in order for the hyper-parameter
optimizer to be generic (work for many/most/all learners)?


NB: In the case of a K-fold hyper-learner, I would expect the user to
completely specify the hyper-parameters, and the hyper-learner could just
blindly pass them along to the sub-learner. For more complex hyper-learners
like a hyper-optimizer or hyper-grid, we would require supported sub-learners
to define a function "get_hyperparam" that returns a
dict(name1: [default, range], name2: ...). These hyper-parameters are
supplied to the learner constructor.

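The "get_hyperparam" convention NB describes, together with a toy grid
hyper-learner consuming it, could be sketched as follows. The dict shape
follows the text; the ToyLearner class and the grid_instances helper are
assumptions made for the example.

```python
import itertools

# Sketch: a supported sub-learner exposes get_hyperparam() returning
# {name: [default, candidate range]}, and a grid hyper-learner feeds each
# combination to the constructor.  All names are illustrative assumptions.

class ToyLearner(object):
    def __init__(self, lr=0.1, n_hidden=100):
        self.lr, self.n_hidden = lr, n_hidden

    @classmethod
    def get_hyperparam(cls):
        # name -> [default, candidate range], as in the convention above
        return {'lr': [0.1, [0.01, 0.1, 1.0]],
                'n_hidden': [100, [50, 100]]}

def grid_instances(learner_cls):
    """Instantiate learner_cls once per point of the hyper-parameter grid."""
    hp = learner_cls.get_hyperparam()
    names = sorted(hp)
    for values in itertools.product(*(hp[n][1] for n in names)):
        yield learner_cls(**dict(zip(names, values)))
```

A distribution-based hyper-optimizer would use the same dict but sample the
ranges instead of enumerating them.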
This K-fold learner, since it is generic, would work by launching multiple
experiments, and would support doing so in parallel inside a job (Python MPI?)
or by launching multiple owned scripts on the cluster that write results to
disk in the way specified by the K-fold learner.

JB asks:
This is not technically possible if the worker nodes and the master node do
not all share a filesystem. There is a soft requirement that the library
support this, so that we can do job control from DIRO without messing around
with colosse, mammouth, condor, angel, etc. all separately.

NB: The hyper-learner would have to support launching jobs on remote servers
via ssh. Common functionality for this could of course be reused between
different hyper-learners.

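The shared launching functionality could be as small as a helper along these
lines. The host name, remote command, and the helper itself are assumptions
for illustration; error handling, polling, and result transfer are omitted.

```python
import subprocess

# Sketch of reusable remote-launch support for hyper-learners.  The helper
# only builds and starts the ssh command; everything else (monitoring,
# fetching results) would sit on top.  All names are illustrative.

def ssh_command(host, remote_cmd):
    """Build the argv list to run `remote_cmd` on `host` via ssh."""
    return ['ssh', host, remote_cmd]

def launch_remote(host, remote_cmd):
    """Start the remote job without blocking; caller keeps the Popen handle."""
    return subprocess.Popen(ssh_command(host, remote_cmd))
```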
JB asks:
The format used to communicate results from 'learner' jobs with the kfold
loop, with the stats collectors, and with the experiment visualization code is
not obvious - any ideas how to handle this?

NB: The DBN is responsible for saving/viewing results inside a DBN experiment.
The hyper-learner controls DBN execution (even in a script on a remote
machine) and collects evaluation measurements after its dbn.predict call.
For K-fold it would typically just save the evaluation distribution and
average in whatever way (internal convention) can be transferred over ssh.
The K-fold hyper-learner would only expose its train interface (no adapt,
predict), since it cannot always be decomposed into many steps, depending on
the sub-learner.

The library would also have a DBN learner with flexible hyper-parameters that
control its detailed architecture.

JB asks:
What kinds of building blocks should make this possible - how much
flexibility, and what kinds are permitted?

NB: Things like the number of layers, the number of hidden units, and any
optional parameters that affect initialization or training (i.e. AE or RBM
variant) that the DBN developer can think of. The final user would have to
specify those hyper-parameters to the K-fold learner anyway.

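For concreteness, a DBN learner's constructor might expose such
architecture-controlling hyper-parameters like this. This is a hypothetical
signature, not the actual Pylearn API; every parameter name and default is an
assumption.

```python
# Hypothetical DBN learner constructor exposing architecture-level
# hyper-parameters of the kind NB lists.  Purely illustrative.

class DBNLearner(object):
    def __init__(self, layer_sizes=(784, 500, 500, 2000),
                 layer_type='rbm',        # 'rbm' or 'ae' variant
                 pretrain_lr=0.01, finetune_lr=0.1, seed=123):
        if layer_type not in ('rbm', 'ae'):
            raise ValueError('unknown layer_type: %r' % layer_type)
        self.layer_sizes = tuple(layer_sizes)
        self.layer_type = layer_type
        self.pretrain_lr = pretrain_lr
        self.finetune_lr = finetune_lr
        self.seed = seed

    @property
    def n_layers(self):
        # Number of weight layers between consecutive unit layers.
        return len(self.layer_sizes) - 1
```

These are exactly the values a get_hyperparam-style convention would enumerate
for a hyper-learner.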
The interface of the provided dataset would have to conform to the possible
inputs that the DBN module understands, i.e. by default 2D numpy arrays. If
more complex dataset needs arise, either subclass a converter for the known
format or add this functionality to the DBN learner directly. Details of the
DBN learner core would resemble the tutorials, would typically be contained in
one straightforward code file, and could potentially use "Theano
formula"-based learners as intermediate steps.

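A converter subclass of the kind mentioned (wrapping a richer dataset so that
a 2D-only learner can consume it, as in the time-series bullet earlier) might
be sketched as follows. Nested lists stand in for numpy arrays, and all names
are assumptions.

```python
# Sketch: adapt a dataset of multi-dimensional time series (each example is
# a list of per-timestep feature vectors) to a learner that only understands
# one flat input vector per example.  Plain lists stand in for the numpy
# arrays the text assumes; all names are illustrative.

def flatten_example(timesteps):
    """Concatenate per-timestep feature vectors into one flat vector."""
    return [feature for step in timesteps for feature in step]

class FlatteningDataset(object):
    def __init__(self, series_dataset):
        self.series_dataset = series_dataset   # yields 3D-shaped batches
    def minibatches(self, batch_size):
        for batch in self.series_dataset.minibatches(batch_size):
            yield [flatten_example(example) for example in batch]
```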
JB asks:

One of the troubles with straightforward code is that it is neither easy to
stop and start (as in long-running jobs) nor to control via a hyper-parameter
optimizer. So I don't think code in the style of the current tutorials is very
useful in the library.

NB: I could see how we could require all learners to define stop and restart
methods, so they would be responsible for saving and restoring themselves. A
hyper-learner's stop and restart methods would in addition recursively call
its sub-learners' stop and restart methods.
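The recursive stop/restart idea could look roughly like this. The method names
follow the text; pickle-based snapshots and the state dict are assumed
mechanisms chosen for the sketch.

```python
import pickle

# Sketch of the stop/restart convention: every learner can snapshot and
# restore itself; a hyper-learner additionally recurses into its
# sub-learners.  State format and names are illustrative assumptions.

class StoppableLearner(object):
    def __init__(self):
        self.state = {'epoch': 0}
    def stop(self):
        """Return a serializable snapshot of this learner."""
        return pickle.dumps(self.state)
    def restart(self, snapshot):
        self.state = pickle.loads(snapshot)

class HyperLearner(StoppableLearner):
    def __init__(self, sublearners):
        StoppableLearner.__init__(self)
        self.sublearners = sublearners
    def stop(self):
        # Recursively snapshot own state plus every sub-learner's.
        return pickle.dumps({'own': self.state,
                             'subs': [s.stop() for s in self.sublearners]})
    def restart(self, snapshot):
        blob = pickle.loads(snapshot)
        self.state = blob['own']
        for sub, sub_snap in zip(self.sublearners, blob['subs']):
            sub.restart(sub_snap)
```

Because stop() returns bytes, the same snapshot could also be written to disk
or shipped over ssh, which connects this to the remote-job discussion above.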