annotate doc/v2_planning/requirements.txt @ 1145:d6d73a9f07b8
API_coding_style: Started to work on official guidelines
author | Olivier Delalleau <delallea@iro> |
---|---|
date | Thu, 16 Sep 2010 16:11:26 -0400 |
parents | 1f5465622394 |
children | 7d34edde029d |
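rev | changeset | parent | description | author |
---|---|---|---|---|
1093 | a65598681620 | | v2planning - initial commit of use_cases, requirements | James Bergstra <bergstrj@iro.umontreal.ca> |
1096 | 2bbc294fa5ac | 1093 | requirements: Added a use case | Olivier Delalleau <delallea@iro> |
1098 | 4eda3f52ebef | 1096 | v2planning - revs to requirements, added architecture | James Bergstra <bergstrj@iro.umontreal.ca> |
1121 | 1f5465622394 | 1098 | requirements: Added comment about potentially conflicting requirements | Olivier Delalleau <delallea@iro> |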
============
Requirements
============


Application Requirements
========================

Terminology and Abbreviations:
------------------------------

MLA - machine learning algorithm

learning problem - a machine learning application typically characterized by a
dataset (possibly dataset folds), one or more functions to be learned from the
data, and one or more metrics to evaluate those functions. Learning problems
are the benchmarks for empirical model comparison.

n. of - number of

SGD - stochastic gradient descent

Users:
------

- New master's and PhD students in the lab should be able to quickly move into
  'production' mode without having to reinvent the wheel.

- Students in the two ML classes should be able to play with the library to
  explore new ML variants. This means some APIs (e.g. Experiment level) must be
  really well documented and conceptually simple.

- Researchers outside the lab (who might study and experiment with our
  algorithms)

- Partners outside the lab (e.g. Bell, Ubisoft) with closed-source commercial
  projects.

Uses:
-----

R1. reproduce previous work (our own and others')

R2. explore MLA variants by swapping components (e.g. optimization algo, dataset,
    hyper-parameters)

R3. analyze experimental results (e.g. plotting training curves, finding best
    models, marginalizing across hyper-parameter choices)

R4. disseminate (or serve as platform for disseminating) our own published algorithms

R5. provide implementations of common MLA components (e.g. classifiers, datasets,
    optimization algorithms, meta-learning algorithms)

R6. drive large-scale parallelizable computations (e.g. grid search, bagging,
    random search)

R7. provide implementations of standard pre-processing algorithms (e.g. PCA,
    stemming, Mel-scale spectrograms, GIST features)

R8. provide high performance suitable for large-scale experiments

R9. be able to use the most efficient algorithms in special-case combinations of
    learning algorithm components (e.g. when there is a fast k-fold validation
    algorithm for a particular model family, the library should not require users
    to rewrite their standard k-fold validation script to use it)

R10. support experiments on a variety of datasets (e.g. movies, images, text,
     sound, reinforcement learning?)

R11. support efficient computations on datasets larger than RAM and GPU memory

R12. support infinite datasets (i.e. generated on the fly)

R13. apply trained models "in production".
     - e.g. say you try many combinations of preprocessing, models and associated
       hyper-parameters, and want to easily be able to recover the full "processing
       pipeline" that performs best, and use it on real/test data later.
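
As a rough illustration of the scenario in R13, the sketch below tries every
combination of preprocessing step and hyper-parameter, keeps the one with the
lowest validation error, and stores it as plain data so that it can be reloaded
and applied later. It is plain Python, not pylearn code: the names
PREPROCESSORS, fit_weight, predict and select_best_pipeline, the one-weight toy
model and the toy data are all assumptions made for the example.

```python
import itertools
import pickle

# Hypothetical preprocessing steps, looked up by name so that a pipeline can be
# stored as plain data (names and numbers) rather than as live objects.
PREPROCESSORS = {
    "identity": lambda x: x,
    "square": lambda x: x * x,
}


def fit_weight(features, targets, l2):
    """Closed-form fit of y ~ w * feature with an L2 penalty on w."""
    num = sum(f * y for f, y in zip(features, targets))
    den = sum(f * f for f in features) + l2
    return num / den


def predict(pipeline, xs):
    """Apply a stored pipeline (preprocessing + trained model) to raw inputs."""
    prep = PREPROCESSORS[pipeline["preprocess"]]
    return [pipeline["w"] * prep(x) for x in xs]


def mean_squared_error(pipeline, xs, ys):
    preds = predict(pipeline, xs)
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


def select_best_pipeline(train, valid, prep_names, l2_values):
    """Try every (preprocessing, hyper-parameter) pair; keep the best on valid."""
    best, best_err = None, float("inf")
    for name, l2 in itertools.product(prep_names, l2_values):
        prep = PREPROCESSORS[name]
        w = fit_weight([prep(x) for x in train[0]], train[1], l2)
        candidate = {"preprocess": name, "l2": l2, "w": w}
        err = mean_squared_error(candidate, valid[0], valid[1])
        if err < best_err:
            best, best_err = candidate, err
    return best


# Toy data following y = 2 * x**2 (made up for this example).
train = ([1.0, 2.0, 3.0], [2.0, 8.0, 18.0])
valid = ([4.0, 5.0], [32.0, 50.0])

best = select_best_pipeline(train, valid, ["identity", "square"], [0.0, 1.0])

# The whole pipeline is plain data, so it can be saved now and reused later.
restored = pickle.loads(pickle.dumps(best))
print(restored["preprocess"], predict(restored, [6.0]))
```

Storing the preprocessing step by name rather than as a function object is one
possible design choice; it keeps the saved pipeline independent of how the
search that produced it was run.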

OD comments: Note that R9 and R13 may conflict with each other. Some
optimizations performed to satisfy R9 may modify the input "symbolic graph" in
such a way that extracting the required components for production purposes
(R13) could be made more difficult (or even impossible). Imagine for instance
that the graph is modified to take advantage of the fact that k-fold validation
can be performed efficiently internally by some specific algorithm. Then it may
no longer be obvious how to remove the k-fold split from the saved model you
want to use in production.
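
One way to picture the tension between R9 and R13, and a possible way around
it, is sketched below in plain Python. It is only an illustration under
assumptions, not a proposed design: MeanRegressor, fast_k_fold, generic_k_fold
and cross_validate are hypothetical names, and no symbolic graph is involved.
The entry point uses a model-specific k-fold shortcut when one exists (R9), but
always returns a model refit on all the data, so that nothing about the folds
ends up in what is saved for production (R13).

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds (sizes differ by at most one)."""
    bounds = [(i * n) // k for i in range(k + 1)]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(k)]


class MeanRegressor:
    """Toy model: always predicts the mean of its training targets."""

    def fit(self, ys):
        self.mean = sum(ys) / len(ys)
        return self

    def predict(self):
        return self.mean

    def fast_k_fold(self, ys, folds):
        """Model-specific shortcut: reuse the global sum instead of refitting.

        Assumes at least two folds, so every training split is non-empty.
        """
        total, n = sum(ys), len(ys)
        errors = []
        for fold in folds:
            fold_sum = sum(ys[i] for i in fold)
            train_mean = (total - fold_sum) / (n - len(fold))
            errors.extend((ys[i] - train_mean) ** 2 for i in fold)
        return sum(errors) / n


def generic_k_fold(model_factory, ys, folds):
    """Reference path: refit a fresh model on every training split."""
    n = len(ys)
    errors = []
    for fold in folds:
        held_out = set(fold)
        train = [y for i, y in enumerate(ys) if i not in held_out]
        model = model_factory().fit(train)
        errors.extend((ys[i] - model.predict()) ** 2 for i in fold)
    return sum(errors) / n


def cross_validate(model_factory, ys, k):
    """User-facing entry point; picks the shortcut when the model offers one.

    Returns the validation error and a model refit on all the data, so the
    object handed to production never contains the k-fold split.
    """
    folds = k_fold_indices(len(ys), k)
    probe = model_factory()
    if hasattr(probe, "fast_k_fold"):
        error = probe.fast_k_fold(ys, folds)
    else:
        error = generic_k_fold(model_factory, ys, folds)
    return error, model_factory().fit(ys)


# Toy targets, made up for this example.
ys = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
error, production_model = cross_validate(MeanRegressor, ys, k=3)
print(error, production_model.predict())
```

The calling script is the same whether or not the shortcut exists, which is the
point of R9; keeping the production model separate from the validation run is
one possible way to avoid the conflict with R13, at the cost of one extra fit.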