comparison doc/v2_planning/requirements.txt @ 1093:a65598681620

v2planning - initial commit of use_cases, requirements
author James Bergstra <bergstrj@iro.umontreal.ca>
date Sun, 12 Sep 2010 21:45:22 -0400
parents
children 2bbc294fa5ac
============
Requirements
============


Application Requirements
========================

Terminology and Abbreviations:
------------------------------

MLA - machine learning algorithm

learning problem - a machine learning application, typically characterized by a
dataset (possibly with dataset folds), one or more functions to be learned from
the data, and one or more metrics to evaluate those functions. Learning problems
are the benchmarks for empirical model comparison.

n. of - number of

SGD - stochastic gradient descent

Users:
------

- New master's and PhD students in the lab should be able to quickly move into
  'production' mode without having to reinvent the wheel.

- Students in the two ML classes should be able to play with the library to
  explore new ML variants. This means some APIs (e.g. the Experiment level)
  must be very well documented and conceptually simple.

- Researchers outside the lab (who might study and experiment with our
  algorithms)

- Partners outside the lab (e.g. Bell, Ubisoft) with closed-source commercial
  projects.

Uses:
-----

R1. reproduce previous work (our own and others')

R2. explore MLA variants by swapping components (e.g. optimization algorithm,
dataset, hyper-parameters).

R3. analyze experimental results (e.g. plotting training curves, finding best
models, marginalizing across hyper-parameter choices)

R4. disseminate (or serve as a platform for disseminating) our own published algorithms

R5. provide implementations of common MLA components (e.g. classifiers, datasets,
optimization algorithms, meta-learning algorithms)

R6. drive large-scale parallelizable computations (e.g. grid search, bagging,
random search); a small sketch of such a sweep appears after this list

R7. provide implementations of standard pre-processing algorithms (e.g. PCA,
stemming, Mel-scale spectrograms, GIST features, etc.)

R8. provide high performance suitable for large-scale experiments

R9. be able to use the most efficient algorithms in special case combinations of
learning algorithm components (e.g. when there is a fast k-fold validation
algorithm for a particular model family, the library should not require users
to rewrite their standard k-fold validation script to use it)

R10. support experiments on a variety of datasets (e.g. movies, images, text,
sound, reinforcement learning?)

R11. support efficient computations on datasets larger than RAM and GPU memory

R12. support infinite datasets (i.e. generated on the fly)
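
As an illustration of the kind of computation R6 refers to, here is a minimal
sketch of a parallel hyper-parameter sweep using only the Python standard
library; it is not the proposed library API, and the objective function is a
made-up stand-in for training and scoring a model::

    import itertools
    from multiprocessing import Pool

    def train_and_score(params):
        """Stand-in for training a model and returning a validation score."""
        lr, l2 = params
        return {'lr': lr, 'l2': l2, 'score': -(lr - 0.1) ** 2 - l2}

    if __name__ == '__main__':
        # Cartesian product of hyper-parameter values [R6: grid search].
        grid = list(itertools.product([0.01, 0.1, 1.0],    # learning rates
                                      [0.0, 1e-4, 1e-2]))  # L2 penalties
        pool = Pool()                      # one worker process per CPU core
        results = pool.map(train_and_score, grid)
        pool.close()
        pool.join()
        print(max(results, key=lambda r: r['score']))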


Basic Design Approach
=====================

An ability to drive parallel computations is essential in addressing [R6,R8].

The basic design approach for the library is to implement:

- a few virtual machines (VMs), some of which can run programs that can be
  parallelized across processors, hosts, and networks;
- MLAs in a Symbolic Expression language (similar to Theano), as required by
  [R5,R7,R8].

MLAs are typically specified as Symbolic programs that are compiled to VM
instructions, but some MLAs may be implemented directly in those instructions.
Symbolic programs are naturally modularized by sub-expressions [R2] and can be
optimized automatically (as in Theano) to address [R9].
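
A minimal Python sketch of this idea, using hypothetical names (Apply,
optimize) rather than any existing Theano or library API: an MLA is written as
a graph of swappable sub-expressions [R2], and a graph rewrite substitutes a
specialized implementation when it recognizes a special-case combination
[R9]::

    class Apply(object):
        """A node in a symbolic expression graph: an op applied to inputs."""
        def __init__(self, op, inputs):
            self.op = op          # e.g. 'kfold', 'train_linear_svm', 'pca'
            self.inputs = inputs  # child Apply nodes or constants

    def optimize(node):
        """Recursively rewrite a graph, substituting faster special cases."""
        node.inputs = [optimize(i) if isinstance(i, Apply) else i
                       for i in node.inputs]
        # Hypothetical rewrite: generic k-fold validation wrapped around a
        # linear SVM becomes a single fused instruction, if one is available.
        if (node.op == 'kfold' and isinstance(node.inputs[0], Apply)
                and node.inputs[0].op == 'train_linear_svm'):
            return Apply('kfold_linear_svm_fast', node.inputs[0].inputs)
        return node

    # The user writes the generic expression; the optimizer picks the fast path.
    expr = Apply('kfold', [Apply('train_linear_svm', ['dataset', 'hyperparams'])])
    print(optimize(expr).op)  # -> 'kfold_linear_svm_fast'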

A VM that caches instruction return values serves as:

- a reliable record of what jobs were run [R1];
- a database of intermediate results that can be analyzed after the
  model-training jobs have completed [R3];
- a clean API to several possible storage and execution backends.
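
A minimal sketch of such a caching VM; the names (CachingVM, call) are
hypothetical rather than a real API, and the storage backend is any dict-like
mapping, so an in-memory dict, an on-disk store, or a database could be
swapped in without changing calling code::

    import hashlib
    import pickle

    class CachingVM(object):
        """Run instructions, caching each return value under a key derived
        from the instruction name and its arguments.  The cache doubles as
        a record of which jobs were run [R1] and as a store of intermediate
        results for later analysis [R3]."""

        def __init__(self, storage):
            self.storage = storage  # any dict-like backend

        def call(self, fn, *args):
            key = hashlib.sha1(pickle.dumps((fn.__name__, args))).hexdigest()
            if key not in self.storage:
                self.storage[key] = fn(*args)
            return self.storage[key]

    # Usage: a result is computed once, then served from the cache.
    vm = CachingVM(storage={})
    total = vm.call(sum, (1, 2, 3))
    again = vm.call(sum, (1, 2, 3))  # cache hit: sum() is not re-run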