Motivation
==========

Yoshua:
-------

We are missing a *Theano Machine Learning library*.

The deep learning tutorials do a good job, but they lack the following features, which I
would like to see in an ML library:

- a well-organized collection of Theano symbolic expressions (formulas) covering most of
  what is needed to implement well-known ML and deep learning algorithms or to create new
  variants without starting from scratch each time; that is, the mathematical core (see
  the formula sketch after this list),

- a well-organized collection of python modules to help with the following:
  - several data-access models that wrap around learning algorithms for interfacing with
    various types of data (static vectors, images, sound, video, generic time-series, etc.)
  - generic utility code for optimization (see the SGD/early-stopping sketch below):
    - stochastic gradient descent variants
    - early stopping variants
    - interfacing to generic 2nd-order optimization methods
    - 2nd-order methods tailored to work on minibatches
    - optimizers for sparse coefficients / parameters
  - generic code for model selection and hyper-parameter optimization (including the use
    and coordination of multiple jobs running on different machines, e.g. using jobman)
  - generic code for performance estimation and experimental statistics
  - visualization tools (using existing python libraries) and examples for all of the above
  - learning algorithm conventions and meta-learning algorithms (bagging, boosting, mixtures
    of experts, etc.) which use them

[Note that many of us already use some instance of all the above, but each one tends to
reinvent the wheel, and newbies don't benefit from a shared knowledge base.]

- a well-documented set of python scripts using the above library to show how to run the
  most common ML algorithms (possibly with examples showing how to run multiple experiments
  with many different models and collect results for statistical comparison). This is
  particularly important for getting pure users to adopt Theano in applied ML work.
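
As a strawman for the "collection of formulas" item, here is a rough sketch of what one
entry in such a module could look like. Nothing here is settled API; the module layout and
function names are made up for illustration::

    import theano.tensor as T

    def binary_crossentropy(output, target):
        """Symbolic cross-entropy between sigmoid outputs and binary targets.

        `output` and `target` are Theano expressions; the caller decides how
        they were produced and what to do with the result.
        """
        return -(target * T.log(output) + (1 - target) * T.log(1 - output))

    def logistic_regression_nll(x, y, W, b):
        """Mean negative log-likelihood of a softmax (logistic regression) layer.

        x: minibatch of inputs, y: vector of integer class labels,
        W, b: shared parameters supplied by the caller.
        """
        p_y_given_x = T.nnet.softmax(T.dot(x, W) + b)
        return -T.mean(T.log(p_y_given_x)[T.arange(y.shape[0]), y])

The point is that each formula stays a plain function over symbolic variables, so it can
be reused in new architectures without committing to any particular trainer or model class.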

Ideally, there would be one person in charge of this project, making sure a coherent and
easy-to-read design is developed, along with many helping hands (to implement the various
helper modules, formulas, and learning algorithms).
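
Similarly, here is a rough sketch of two of the optimization utilities listed above: a
generic SGD-update helper and a minimal early-stopping driver. The names and the interface
(pre-compiled `train`/`valid_err` callables) are assumptions for illustration, not a
proposal::

    import theano.tensor as T

    def sgd_updates(params, cost, lr):
        """Plain SGD update list; variants (momentum, weight decay, ...)
        would live alongside this in the same module."""
        gparams = T.grad(cost, params)
        return [(p, p - lr * g) for p, g in zip(params, gparams)]

    def early_stopping_fit(train, valid_err, n_train_batches,
                           n_epochs=100, patience=10):
        """Call `train(i)` on each minibatch every epoch; stop once
        `valid_err()` has not improved for `patience` consecutive epochs."""
        best, bad_epochs = float('inf'), 0
        for epoch in range(n_epochs):
            for i in range(n_train_batches):
                train(i)
            err = valid_err()
            if err < best:
                best, bad_epochs = err, 0
            else:
                bad_epochs += 1
                if bad_epochs >= patience:
                    break
        return best

Here `train` and `valid_err` would be compiled with `theano.function`, with the pairs from
`sgd_updates(...)` passed as the `updates` argument of the training function.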


James:
-------

I am interested in the design and implementation of the "well-organized collection of Theano
symbolic expressions..."

I would like to explore algorithms for hyper-parameter optimization, following up on some
"high-throughput" work. I'm most interested in the "generic code for model selection and
hyper-parameter optimization..." and "generic code for performance estimation...".
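
To make "generic" concrete, the simplest version I have in mind is plain random search over
a user-defined space; everything below is a hypothetical illustration, not an interface
proposal::

    import random

    def random_search(train_and_score, space, n_trials=50, seed=0):
        """Draw hyper-parameter settings at random and keep the best one.

        `space` maps names to callables that sample a value from an rng;
        `train_and_score(params)` returns a validation error to minimize.
        """
        rng = random.Random(seed)
        best_err, best_params = float('inf'), None
        for _ in range(n_trials):
            params = dict((k, draw(rng)) for k, draw in space.items())
            err = train_and_score(params)
            if err < best_err:
                best_err, best_params = err, params
        return best_params, best_err

    # Example space: log-uniform learning rate, a few hidden-layer sizes.
    space = {
        'lr': lambda rng: 10 ** rng.uniform(-4, -1),
        'n_hidden': lambda rng: rng.choice([100, 300, 1000]),
    }

The same driver could hand trials to jobman instead of running them in-process.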

I have some experience with the data-access requirements, and some lessons I'd like to share
on that, but no time to work on that aspect of things.

I will continue to contribute to the "well-documented set of python scripts using the above
to showcase common ML algorithms...". I have an Olshausen & Field-style sparse-coding script
that could be polished up. I am also implementing the mcRBM, and I'll be able to add that
when it's done.


Suggestions for how to tackle various desiderata
================================================



Functional Specifications
=========================

Put these into separate text files so that this one does not become a monster.
For each thing with a functional spec (e.g. datasets library, optimization library), make a
separate file.