Mercurial repository: pylearn
comparison: doc/v2_planning.txt @ 946:7c4504a4ce1a

additions to formulas, data access, hyper-params, scripts

author:   Yoshua Bengio <bengioy@iro.umontreal.ca>
date:     Wed, 11 Aug 2010 21:32:31 -0400
parents:  cafa16bfc7df
children: 216f4ce969b2

comparing revisions 945:cafa16bfc7df and 946:7c4504a4ce1a

Theano Symbolic Expressions for ML
----------------------------------

We could make this a submodule of pylearn: ``pylearn.nnet``.

Yoshua: I would use a different name, e.g. "pylearn.formulas", to emphasize
that it is not just about neural nets, and that this is a collection of
formulas (expressions) rather than completely self-contained classes for
learners. We could have an "nnet.py" file for neural nets, though.

There are a number of ideas floating around for how to handle classes /
modules (LeDeepNet, pylearn.shared.layers, pynnet), so let's implement as much
math as possible in global functions with no classes. There are no models in
the wish list that require more than a few vectors and matrices to
parametrize. Global functions are more reusable than classes.
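
To make the "global functions, no classes" idea concrete, here is a minimal
sketch of what such formulas could look like in Theano (the function names and
signatures are invented for illustration, not an existing pylearn API)::

    import theano.tensor as T

    def logistic_layer(x, w, b):
        """Affine transform followed by a sigmoid, as a plain global function.

        x: (n_examples, n_in) symbolic matrix,
        w: (n_in, n_out) weights, b: (n_out,) biases.
        """
        return T.nnet.sigmoid(T.dot(x, w) + b)

    def binary_crossentropy(output, target):
        """Mean cross-entropy between predicted probabilities and 0/1 targets."""
        return -T.mean(target * T.log(output) + (1 - target) * T.log(1 - output))

Because these are just symbolic expressions, they compose freely with anything
else written in Theano, which is the reusability argument above.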

[...] to example (whose type and nature depend on the dataset; it could, for
instance, be an (image, label) pair). This interface permits iterating over
the dataset, shuffling the dataset, and splitting it into folds. For
efficiency, it is nice if the dataset interface supports looking up several
index values at once, because looking up many examples at once can sometimes
be faster than looking each one up in turn. In particular, looking up
a consecutive block of indices, or a slice, should be well supported.
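
As an illustration, a minimal in-memory sketch of such an interface (the class
and method names are invented, not an existing pylearn API)::

    import numpy

    class MemoryDataset(object):
        """Toy dataset supporting single-index, slice, and block lookup."""

        def __init__(self, inputs, targets):
            self.inputs = numpy.asarray(inputs)
            self.targets = numpy.asarray(targets)

        def __len__(self):
            return len(self.inputs)

        def __getitem__(self, idx):
            # idx may be an int, a slice, or a sequence of ints; numpy
            # indexing handles all three, so blocks and slices stay cheap.
            return self.inputs[idx], self.targets[idx]

        def shuffled(self, rng):
            # rng is e.g. a numpy.random.RandomState instance.
            perm = rng.permutation(len(self))
            return MemoryDataset(self.inputs[perm], self.targets[perm])

        def folds(self, k):
            # Split into k approximately equal folds.
            for idx in numpy.array_split(numpy.arange(len(self)), k):
                yield MemoryDataset(self.inputs[idx], self.targets[idx])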

Some datasets may not support random access (e.g. a random number stream), and
that's fine if an exception is raised. The user will see a NotImplementedError
or similar, and try something else. We might want a way to test whether a
dataset supports random access without having to load an example.
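
One possible convention (purely a sketch; the attribute name is invented) is a
capability flag that can be probed before any example is loaded::

    def supports_random_access(dataset):
        """True if the dataset can be indexed; no example is loaded.

        Assumes datasets advertise the capability through a boolean
        'random_access' attribute, with streams setting it to False.
        """
        return getattr(dataset, 'random_access', False)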

A more intuitive interface for many datasets (or subsets) is to load them as
matrices or lists of examples. This format is more convenient to work with at
an ipython shell, for example. It is not good to provide only the "dataset [...]

[...] defined implicitly by the contents of /data/lisa/data at DIRO, but it
would be better to document in pylearn what the contents of this folder should
be as much as possible. It should be possible to rebuild this tree from
information found in pylearn.

Yoshua (about ideas proposed by Pascal Vincent a while ago; a sketch follows
this list):

- We may want to distinguish between datasets and tasks: a task defines not
  just the data but also things like what is the input and what is the target
  (for supervised learning), and *importantly* a set of performance metrics
  that make sense for this task (e.g. those used by papers solving a
  particular task, or reported for a particular benchmark).

- We should discuss a few "standards" that datasets and tasks may comply to,
  such as:

  - "input" and "target" fields inside each example, for supervised or
    semi-supervised learning tasks (with a convention for the semi-supervised
    case when only the input or only the target is observed)
  - "input" for unsupervised learning
  - conventions for missing-valued components inside the input or target
  - how examples that are sequences are treated (e.g. the input or the target
    is a sequence)
  - how time-stamps are specified when appropriate (e.g., the sequences are
    asynchronous)
  - how error metrics are specified:

    * example-level statistics (e.g. classification error)
    * dataset-level statistics (e.g. ROC curve, mean and standard error of
      the error)
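
A purely hypothetical sketch of such conventions: a task bundles a dataset
(whose examples carry "input" and "target" fields) with the metrics that make
sense for it; none of these names are an existing pylearn API::

    import numpy

    class Task(object):
        """A dataset plus the performance metrics appropriate for it."""

        def __init__(self, dataset, metrics):
            self.dataset = dataset    # index -> {'input': ..., 'target': ...}
            self.metrics = metrics    # name -> callable(predictions, targets)

        def evaluate(self, predictions, targets):
            return dict((name, fn(predictions, targets))
                        for name, fn in self.metrics.items())

    def zero_one_error(predictions, targets):
        # Example-level statistic: mean classification error.
        return numpy.mean(numpy.asarray(predictions) != numpy.asarray(targets))

    some_task = Task(dataset=None,  # placeholder for a real dataset object
                     metrics={'zero_one': zero_one_error})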

Model Selection & Hyper-Parameter Optimization
----------------------------------------------

[...] the experiment to run and the hyper-parameter space to search. Then this
application driver would take control of scheduling jobs and running them on
various computers... I'm imagining a potentially ugly brute of a hack that's
not necessarily something we will want to expose at a low level for reuse.

Yoshua: We want both a library-defined driver that takes instructions about
how to generate new hyper-parameter combinations (e.g. implicitly providing a
prior distribution from which to sample them), and examples showing how to use
it in typical cases. Note that sometimes we just want to find the best
configuration of hyper-parameters, but sometimes we want to do a more subtle
analysis; often a combination of both. In this respect it could be useful for
the user to distinguish hyper-parameters about which scientific questions are
asked (e.g. the depth of an architecture) from hyper-parameters that we would
like to marginalize/maximize over (e.g. the learning rate). This can influence
both the sampling of configurations (we want to make sure that all
combinations of question-driving hyper-parameters are covered) and the
analysis of results (we may want to estimate ANOVAs, or averages or quantiles
over the non-question-driving hyper-parameters).
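
A minimal sketch of that sampling scheme (all of the names here are invented):
enumerate every combination of the question-driving hyper-parameters, and draw
the ones to marginalize over from their prior::

    import itertools
    import random

    def sample_configs(question_grid, nuisance_priors, n_samples_each, seed=0):
        """Yield hyper-parameter configurations.

        question_grid:   dict name -> list of values; every combination is covered.
        nuisance_priors: dict name -> sampler taking a random.Random instance.
        n_samples_each:  nuisance draws per question-driving combination.
        """
        rng = random.Random(seed)
        names = sorted(question_grid)
        for combo in itertools.product(*(question_grid[n] for n in names)):
            for _ in range(n_samples_each):
                config = dict(zip(names, combo))
                config.update((n, draw(rng)) for n, draw in nuisance_priors.items())
                yield config

    # Depth is question-driving; the learning rate is marginalized over.
    configs = list(sample_configs(
        {'depth': [1, 2, 3]},
        {'learning_rate': lambda rng: 10 ** rng.uniform(-4, -1)},
        n_samples_each=5))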

Python scripts for common ML algorithms
---------------------------------------

The script aspect of this feature request makes me think that what would be
good here is more tutorial-type scripts. And the existing tutorials could
potentially be rewritten to use some of the pylearn.nnet expressions. More
tutorials / demos would be great.

Yoshua: agreed that we could write them as tutorials, but note how the spirit
would differ from the current deep learning tutorials: we would not mind using
library code as much as possible, instead of trying to flatten everything out
in the interest of pedagogical simplicity. These tutorials should be meant to
illustrate not the algorithms themselves but *how to take advantage of the
library*. They could also be used as *BLACK BOX* implementations by people who
don't want to dig lower and just want to run experiments.

Functional Specifications
=========================

TODO:
[...]
For each thing with a functional spec (e.g. datasets library, optimization
library), make a separate file.


pylearn.formulas
----------------

Directory with functions for building layers, calculating classification
errors, cross-entropies with various distributions, free energies, etc. This
module would consist mostly of global functions, Theano Ops, and Theano
optimizations.

Yoshua: I would break it down into module files, e.g.:

pylearn.formulas.costs: generic / common cost functions, e.g. various
    cross-entropies, squared error, absolute error, various sparsity
    penalties (L1, Student)

pylearn.formulas.linear: formulas for linear classifiers, linear regression,
    factor analysis, PCA

pylearn.formulas.nnet: formulas for building layers of various kinds, various
    activation functions, layers which could be plugged with various costs &
    penalties, and stacked

pylearn.formulas.ae: formulas for auto-encoders, denoising auto-encoder
    variants, and corruption processes

pylearn.formulas.rbm: energies, free energies, conditional distributions,
    Gibbs sampling

pylearn.formulas.trees: formulas for decision trees

pylearn.formulas.boosting: formulas for boosting variants

etc.
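
For instance, a hypothetical pylearn.formulas.costs could start out as a
handful of global Theano expressions (a sketch under the naming above, not an
agreed-upon API)::

    # hypothetical contents of pylearn/formulas/costs.py
    import theano.tensor as T

    def squared_error(output, target):
        """Mean squared error, averaged over examples and components."""
        return T.mean((output - target) ** 2)

    def abs_error(output, target):
        """Mean absolute error."""
        return T.mean(abs(output - target))

    def l1_penalty(params):
        """L1 sparsity penalty, summed over a list of parameter tensors."""
        return sum(T.sum(abs(p)) for p in params)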

Indexing Convention
~~~~~~~~~~~~~~~~~~~

Something to decide on - Fortran-style or C-style indexing. Although we
have [...]