.. Annotated source: doc/v2_planning/main_plan.txt @ 1507:2a6a6f16416c (pylearn Mercurial repository)
.. Revision 1507 ("fix import.") by Frederic Bastien <nouiz@nouiz.org>, Mon, 12 Sep 2011 11:45:41 -0400, parent 0e12ea6ba661

Motivation
==========

Yoshua (points discussed Thursday Sept 2, 2010 at LISA tea-talk)
----------------------------------------------------------------

****** Why we need to get better organized in our code-writing ******

- current state of affairs on top of Theano is anarchic and does not lend itself to easy code re-use
- the lab is growing and will continue to grow significantly, and more people outside the lab are using Theano
- we have new industrial partners and funding sources that demand deliverables, and more/better collectively organized efforts

*** Who can take advantage of this ***

- us, directly, taking advantage of the different advances made by different researchers in the lab to yield better models
- us, easier to compare different models and different datasets with different metrics on different computing platforms available to us
- future us, new students, able to quickly move into 'production' mode without having to reinvent the wheel
- students in the two ML classes, able to play with the library to explore new ML variants
- other ML researchers in academia, able to play with our algorithms, try new variants, cite our papers
- non-ML users in or out of academia, and our user-partners


*** Move with care ***

- Write down use-cases, examples for each type of module, do not try to be TOO general
- Want to keep ease of exploring and flexibility, not create a prison
- Too many constraints can lead to paralysis, especially in a C++ object-oriented model
- Too few guidelines lead to code components that are not interchangeable
- Poor code practice leads to buggy, spaghetti code

*** What ***

- define standards
- write up a few instances of each basic type (dataset, learner, optimizer, hyper-parameter exploration boilerplate, etc.), enough to implement some of the basic algorithms we use often (e.g. those in the tutorials)
- let the library grow according to our needs
- keep tight reins on it to control quality

*** Content and Form ***

We need to establish guidelines and conventions for

* Content: what are the re-usable components? define conventions or API for each, make sure they fit with each other
* Form: social engineering, coding practices and conventions, code review, incentives

Yoshua:
-------

We are missing a *Theano Machine Learning library*.

The deep learning tutorials do a good job but they lack the following features, which I would like to see in a ML library:

- a well-organized collection of Theano symbolic expressions (formulas) for handling most of
  what is needed either in implementing existing well-known ML and deep learning algorithms or
  for creating new variants (without having to start from scratch each time), that is the
  mathematical core,

- a well-organized collection of python modules to help with the following:

  - several data-access models that wrap around learning algorithms for interfacing with various types of data (static vectors, images, sound, video, generic time-series, etc.)
  - generic utility code for optimization

    - stochastic gradient descent variants
    - early stopping variants
    - interfacing to generic 2nd order optimization methods
    - 2nd order methods tailored to work on minibatches
    - optimizers for sparse coefficients / parameters

  - generic code for model selection and hyper-parameter optimization (including the use and coordination of multiple jobs running on different machines, e.g. using jobman)
  - generic code for performance estimation and experimental statistics
  - visualization tools (using existing python libraries) and examples for all of the above
  - learning algorithm conventions and meta-learning algorithms (bagging, boosting, mixtures of experts, etc.) which use them

  [Note that many of us already use some instance of all the above, but each one tends to reinvent the wheel and newbies don't benefit from a knowledge base.]

- a well-documented set of python scripts using the above library to show how to run the most
  common ML algorithms (possibly with examples showing how to run multiple experiments with
  many different models and collect statistical comparative results). This is particularly
  important for pure users to adopt Theano in their ML application work.

Ideally, there would be one person in charge of this project, making sure a coherent and
easy-to-read design is developed, along with many helping hands (to implement the various
helper modules, formulae, and learning algorithms).


James:
-------

I am interested in the design and implementation of the "well-organized collection of Theano
symbolic expressions..."

I would like to explore algorithms for hyper-parameter optimization, following up on some
"high-throughput" work. I'm most interested in the "generic code for model selection and
hyper-parameter optimization..." and "generic code for performance estimation...".

I have some experience with the data-access requirements, and some lessons I'd like to share
on that, but no time to work on that aspect of things.

I will continue to contribute to the "well-documented set of python scripts using the above to
showcase common ML algorithms...". I have an Olshausen & Field-style sparse coding script that
could be polished up. I am also implementing the mcRBM and I'll be able to add that when it's
done.



Suggestions for how to tackle various desiderata
================================================


Theano Symbolic Expressions for ML
----------------------------------

We could make this a submodule of pylearn: ``pylearn.nnet``.

Yoshua: I would use a different name, e.g., "pylearn.formulas" to emphasize that it is not just
about neural nets, and that this is a collection of formulas (expressions), rather than
completely self-contained classes for learners. We could have a "nnet.py" file for
neural nets, though.

There are a number of ideas floating around for how to handle classes /
modules (LeDeepNet, pylearn.shared.layers, pynnet, DeepAnn) so let's implement as much
math as possible in global functions with no classes. There are no models in
the wish list that require more than a few vectors and matrices to parametrize.
Global functions are more reusable than classes.
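
As a rough illustration of the "global functions, no classes" idea, here is a minimal sketch in plain Python. The function names (``affine``, ``sigmoid``, ``logistic_layer``) are hypothetical, and plain lists stand in for Theano symbolic variables; an actual pylearn module would presumably build Theano expressions instead.

```python
import math


def affine(x, W, b):
    """Return W.x + b for a vector x, a matrix W (list of rows) and a bias vector b."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]


def sigmoid(v):
    """Element-wise logistic sigmoid of a vector."""
    return [1.0 / (1.0 + math.exp(-v_i)) for v_i in v]


def logistic_layer(x, W, b):
    """Compose the two formulas above; the parameters are passed in, not stored."""
    return sigmoid(affine(x, W, b))


# No object owns the parameters, so any other formula can reuse them.
W = [[1.0, -1.0], [0.5, 0.5]]
b = [0.0, -1.0]
hidden = logistic_layer([2.0, 2.0], W, b)
```

Because the parameters are ordinary arguments, the same ``W`` and ``b`` can be handed to a different formula without subclassing or adapting anything.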


Data access
-----------

A general interface to datasets from the perspective of an experiment driver
(e.g. kfold) is to see them as a function that maps an index (typically an integer)
to an example (whose type and nature depend on the dataset; it could, for
instance, be an (image, label) pair). This interface permits iterating over
the dataset, shuffling the dataset, and splitting it into folds. For
efficiency, it is nice if the dataset interface supports looking up several
index values at once, because looking up many examples at once can sometimes
be faster than looking each one up in turn. In particular, looking up
a consecutive block of indices, or a slice, should be well supported.

Some datasets may not support random access (e.g. a random number stream) and
that's fine if an exception is raised. The user will see a NotImplementedError
or similar, and try something else. We might want to have a way to test
whether a dataset is random-access or not without having to load an example.
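
The "dataset as a function of the index" view could look roughly like the sketch below. The class names and the ``random_access`` attribute are hypothetical, chosen only to show integer, slice and multi-index lookup, and a way to test for random access without loading an example.

```python
class ListDataset:
    """Hypothetical random-access dataset backed by an in-memory list of examples."""

    random_access = True  # can be checked without loading any example

    def __init__(self, examples):
        self._examples = list(examples)

    def __len__(self):
        return len(self._examples)

    def __getitem__(self, index):
        if isinstance(index, slice):            # consecutive block of indices
            return self._examples[index]
        if isinstance(index, (list, tuple)):    # several index values at once
            return [self._examples[i] for i in index]
        return self._examples[index]            # a single integer index


class StreamDataset:
    """Hypothetical dataset wrapping an iterator, e.g. a random number stream."""

    random_access = False

    def __init__(self, iterator):
        self._iterator = iterator

    def __iter__(self):
        return self._iterator

    def __getitem__(self, index):
        raise NotImplementedError("this dataset does not support random access")


data = ListDataset([("img%d" % i, i % 2) for i in range(6)])
single = data[3]         # one (image, label) pair
block = data[1:3]        # a slice
several = data[[0, 5]]   # several indices in one call
```

A driver such as kfold can check ``random_access`` first and fall back to plain iteration when it is false.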


A more intuitive interface for many datasets (or subsets) is to load them as
matrices or lists of examples. This format is more convenient to work with at
an ipython shell, for example. It is not good to provide only the "dataset
as a function" view of a dataset. Even if a dataset is very large, it is nice
to have a standard way to get some representative examples in a convenient
structure, to be able to play with them in ipython.


Another thing to consider related to datasets is that there are a number of
other efforts to have standard ML datasets, and we should be aware of them,
and compatible with them when it's easy:

- mldata.org (they have a file format, not sure how many use it)
- weka (ARFF file format)
- scikits.learn
- hdf5 / pytables


pylearn.datasets uses a DATA_ROOT environment variable to locate a filesystem
folder that is assumed to have a standard form across different installations.
That's where the data files are. The correct format of this folder is currently
defined implicitly by the contents of /data/lisa/data at DIRO, but it would be
better to document in pylearn what the contents of this folder should be as
much as possible. It should be possible to rebuild this tree from information
found in pylearn.
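
A minimal sketch of how a dataset module might resolve files under that folder. Only the ``DATA_ROOT`` variable name comes from the text above; the helper name ``data_path`` and the ``mnist/train.npy`` sub-path are made up for illustration and do not describe the actual layout of the folder.

```python
import os


def data_path(*parts):
    """Join path components under the folder named by the DATA_ROOT variable."""
    root = os.environ.get("DATA_ROOT")
    if root is None:
        raise RuntimeError("the DATA_ROOT environment variable is not set")
    return os.path.join(root, *parts)


# Illustrative usage with a made-up root folder and file layout.
os.environ["DATA_ROOT"] = os.path.join(os.sep, "tmp", "example_data")
train_file = data_path("mnist", "train.npy")
```

Documenting the expected sub-paths next to such a helper is one way to make the tree reproducible from information kept in pylearn.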

Yoshua (about ideas proposed by Pascal Vincent a while ago):

- we may want to distinguish between datasets and tasks: a task defines
  not just the data but also things like what is the input and what is the
  target (for supervised learning), and *importantly* a set of performance metrics
  that make sense for this task (e.g. those used by papers solving a particular
  task, or reported for a particular benchmark)

- we should discuss a few "standards" that datasets and tasks may comply with, such as

  - "input" and "target" fields inside each example, for supervised or semi-supervised learning tasks
    (with a convention for the semi-supervised case when only the input or only the target is observed)
  - "input" for unsupervised learning
  - conventions for missing-valued components inside input or target
  - how examples that are sequences are treated (e.g. the input or the target is a sequence)
  - how time-stamps are specified when appropriate (e.g., the sequences are asynchronous)
  - how error metrics are specified

    * example-level statistics (e.g. classification error)
    * dataset-level statistics (e.g. ROC curve, mean and standard error of error)
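
One possible shape for these conventions, as a sketch only: the dict layout, the use of ``None`` for an unobserved target, and all function names below are assumptions made for illustration, not an agreed standard.

```python
def classification_error(prediction, target):
    """Example-level statistic: 1.0 for a mistake, 0.0 for a correct prediction."""
    return 0.0 if prediction == target else 1.0


def mean_and_standard_error(values):
    """Dataset-level statistic computed from example-level values."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, (variance / n) ** 0.5


# Each example is a dict with "input" and "target" fields; None marks an
# unobserved target (one possible semi-supervised convention, assumed here).
examples = [
    {"input": [0.0, 1.0], "target": 1},
    {"input": [1.0, 0.0], "target": 0},
    {"input": [1.0, 1.0], "target": None},
    {"input": [0.0, 0.0], "target": 1},
]

# A task bundles the data with the metrics that make sense for it.
task = {
    "examples": examples,
    "example_metric": classification_error,
    "dataset_metric": mean_and_standard_error,
}


def predict(x):
    """Toy predictor used only to exercise the metrics."""
    return 1 if x[1] > 0.5 else 0


labelled = [ex for ex in task["examples"] if ex["target"] is not None]
errors = [task["example_metric"](predict(ex["input"]), ex["target"]) for ex in labelled]
mean_error, std_error = task["dataset_metric"](errors)
```

Keeping the metrics on the task rather than on the dataset lets the same data back several tasks with different evaluation criteria.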


Model Selection & Hyper-Parameter Optimization
----------------------------------------------

Driving a distributed computing job for a long time to optimize
hyper-parameters using one or more clusters is the goal here.
Although there might be some library-type code to write here, I think of this
more as an application template. The user would use python code to describe
the experiment to run and the hyper-parameter space to search. Then this
application-driver would take control of scheduling jobs and running them on
various computers... I'm imagining a potentially ugly brute of a hack that's
not necessarily something we will want to expose at a low level for reuse.

Yoshua: We want both the library-defined driver that takes instructions about how to generate
new hyper-parameter combinations (e.g. implicitly providing a prior distribution from which
to sample them), and examples showing how to use it in typical cases.
Note that sometimes we just want to find the best configuration of hyper-parameters,
but sometimes we want to do more subtle analysis. Often a combination of both.
In this respect it could be useful for the user to distinguish the hyper-parameters about
which scientific questions are asked (e.g. depth of an architecture) from the
hyper-parameters that we would like to marginalize/maximize over (e.g. learning rate).
This can influence both the sampling of configurations (we want to make sure that all
combinations of question-driving hyper-parameters are covered) and the analysis
of results (we may want to compute ANOVAs, averages, or quantiles over
the non-question-driving hyper-parameters).
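
The distinction between question-driving and other hyper-parameters could be sketched as follows. Everything here is hypothetical: the search space, the log-uniform prior on the learning rate, the function names, and ``toy_error``, which stands in for a real experiment that a driver would launch as jobs.

```python
import itertools
import random
from collections import defaultdict

# Hypothetical search space: "depth" drives the scientific question, while the
# learning rate is a nuisance hyper-parameter sampled from a log-uniform prior.
question_space = {"depth": [1, 2, 3]}
nuisance_priors = {"learning_rate": lambda rng: 10 ** rng.uniform(-4, -1)}


def sample_configurations(n_per_combination, seed=0):
    """Cover every question-driving combination; sample the rest from the priors."""
    rng = random.Random(seed)
    names = sorted(question_space)
    configs = []
    for values in itertools.product(*(question_space[n] for n in names)):
        for _ in range(n_per_combination):
            config = dict(zip(names, values))
            for name, prior in nuisance_priors.items():
                config[name] = prior(rng)
            configs.append(config)
    return configs


def best_per_question(results, key="depth"):
    """Optimize over the nuisance hyper-parameters: keep the lowest error per key."""
    best = defaultdict(lambda: float("inf"))
    for config, error in results:
        best[config[key]] = min(best[config[key]], error)
    return dict(best)


def toy_error(config):
    """Stand-in for a real experiment; a real driver would launch jobs instead."""
    return abs(config["learning_rate"] - 0.01) + 1.0 / config["depth"]


configs = sample_configurations(n_per_combination=5)
results = [(c, toy_error(c)) for c in configs]
summary = best_per_question(results)
```

Replacing ``min`` with a mean or a quantile in ``best_per_question`` would give the averaging style of analysis instead of the maximizing one.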

Python scripts for common ML algorithms
---------------------------------------

The script aspect of this feature request makes me think that what would be
good here is more tutorial-type scripts. And the existing tutorials could
potentially be rewritten to use some of the pylearn.nnet expressions. More
tutorials / demos would be great.

Yoshua: agreed that we could write them as tutorials, but note how the
spirit would be different from the current deep learning tutorials: we would
not mind using library code as much as possible instead of trying to flatten
out everything in the interest of pedagogical simplicity. Instead, these
tutorials should be meant to illustrate not the algorithms but *how to take
advantage of the library*. They could also be used as *BLACK BOX* implementations
by people who don't want to dig lower and just want to run experiments.

Functional Specifications
=========================

TODO:
Put these into different text files so that this one does not become a monster.
For each thing with a functional spec (e.g. datasets library, optimization library) make a
separate file.

Indexing Convention
===================

Something to decide on: Fortran-style or C-style indexing. Although we have
often used C-style indexing in the past (for efficiency in C!) this is no
longer an issue with numpy because the physical layout is independent of the
indexing order. The fact remains that Fortran-style indexing follows linear
algebra conventions, while C-style indexing does not. If a global function
includes a lot of math derivations, it would be *really* nice if the code used
the same convention for the orientation of matrices, and endlessly annoying to
always have to transpose everything.
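
The point that physical layout is independent of the indexing order can be illustrated without numpy. The sketch below uses a hypothetical ``StridedMatrix`` class that stores a matrix in a flat list and maps a logical index ``(i, j)`` to an offset through strides, which is the same idea numpy uses for its C-ordered and Fortran-ordered arrays.

```python
class StridedMatrix:
    """Hypothetical 2-D view over a flat buffer, addressed through strides."""

    def __init__(self, rows, cols, order="C"):
        self.shape = (rows, cols)
        self.buffer = [0] * (rows * cols)
        # Row-major ("C") keeps a row contiguous; column-major ("F") keeps a column contiguous.
        self.strides = (cols, 1) if order == "C" else (1, rows)

    def _offset(self, i, j):
        return i * self.strides[0] + j * self.strides[1]

    def __getitem__(self, index):
        i, j = index
        return self.buffer[self._offset(i, j)]

    def __setitem__(self, index, value):
        i, j = index
        self.buffer[self._offset(i, j)] = value


def fill(matrix):
    """Write 10*i + j at logical position (i, j)."""
    rows, cols = matrix.shape
    for i in range(rows):
        for j in range(cols):
            matrix[i, j] = 10 * i + j
    return matrix


c_matrix = fill(StridedMatrix(2, 3, order="C"))
f_matrix = fill(StridedMatrix(2, 3, order="F"))
```

Both matrices answer ``m[i, j]`` identically although their buffers are ordered differently, so the choice of matrix orientation in the code can follow the math rather than the storage.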