Use Cases (Functional Requirements)
===================================

These use cases exhibit pseudo-code for some of the sorts of tasks listed in
the requirements document (requirements.txt).


Evaluate a classifier on MNIST
------------------------------

The evaluation of a classifier on MNIST requires iterating over examples in
some set (e.g. validation, test) and comparing the model's prediction with the
correct answer. The score of the classifier is the number of correct
predictions divided by the total number of predictions.
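
Ignoring the symbolic machinery for a moment, the computation itself is just a
counting loop. A minimal eager sketch, assuming examples iterate as
(features, label) pairs (that protocol is itself still an open question):

    def accuracy_impl(function, examples):
        """Fraction of examples whose predicted label matches the target."""
        n_correct = 0
        n_total = 0
        for features, label in examples:  # assumed (features, label) protocol
            n_correct += int(function(features) == label)
            n_total += 1
        return n_correct / float(n_total)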

To perform this calculation, the user should specify:
- the classifier (e.g. a function operating on weights loaded from disk)
- the dataset (e.g. MNIST)
- the subset of examples on which to evaluate (e.g. the test set)

For example:

    vm.call(classification_accuracy(
        function=classifier,
        examples=MNIST.validation_iterator))

The user types very little beyond the description of the fields necessary for
the computation; there is no boilerplate. The `MNIST.validation_iterator` must
respect a protocol that remains to be worked out.

The `vm.call` is a compilation and execution step, as opposed to the
symbolic-graph building performed by the `classification_accuracy` call.
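
To make the split between graph construction and execution concrete, here is a
minimal sketch of the two layers. `Apply` and the `VM` class are hypothetical
names, not a settled design, and `accuracy_impl` is the eager function
sketched above:

    class Apply(object):
        """A node in the expression graph: an operation plus its kwargs."""
        def __init__(self, fn, kwargs):
            self.fn = fn
            self.kwargs = kwargs

    def classification_accuracy(**kwargs):
        # Graph construction: record the call, compute nothing yet.
        return Apply(accuracy_impl, kwargs)

    class VM(object):
        def call(self, node):
            # A real implementation would compile and optimize here; this
            # sketch just evaluates argument sub-graphs, then applies the op.
            kwargs = dict((k, self.call(v) if isinstance(v, Apply) else v)
                          for k, v in node.kwargs.items())
            return node.fn(**kwargs)

With these definitions, the `vm.call` above behaves exactly like calling
`accuracy_impl` directly, except that the whole graph exists first and can be
inspected or optimized before anything runs.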


Train a linear classifier on MNIST
----------------------------------

The training of a linear classifier requires the specification of:

- the problem dimensions (e.g. number of inputs, number of classes)
- a parameter initialization method
- regularization
- the dataset
- a schedule for obtaining training examples (e.g. batch, online, minibatch,
  weighted examples)
- an algorithm for adapting parameters (e.g. SGD, conjugate gradient)
- a stopping criterion (which may be expressed in terms of validation examples)

Often the dataset determines the problem dimensions.

Often the training examples and validation examples come from the same set
(e.g. one large matrix of all examples), but this is not necessarily the case.

There are many ways the training could be configured, but here is one:

    vm.call(
        halflife_stopper(
            # a linear classifier is sized by its inputs and classes
            initial_model=random_linear_classifier(MNIST.n_inputs,
                                                   MNIST.n_classes,
                                                   r_seed=234432),
            burnin=100,
            score_fn=vm_lambda(('learner_obj',),
                classification_accuracy(
                    examples=MNIST.validation_dataset,
                    function=as_classifier('learner_obj'))),
            step_fn=vm_lambda(('learner_obj',),
                sgd_step_fn(
                    parameters=vm_getattr('learner_obj', 'params'),
                    cost_and_updates=classif_nll('learner_obj',
                        example_stream=minibatches(
                            source=MNIST.training_dataset,
                            batchsize=100,
                            loop=True)),
                    momentum=0.9,
                    anneal_at_iter=50,
                    n_iter=100))))  # step_fn goes through lots of examples (e.g. an epoch)
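
The names above suggest a generic early-stopping driver: run `step_fn`
repeatedly, score the model with `score_fn`, and stop once the score has
stopped improving for long enough. Ignoring the symbolic wrapping, an eager
sketch of that control flow (the halving/patience rule is a guess; only the
argument names come from the example above):

    def halflife_stopper(initial_model, burnin, score_fn, step_fn):
        """Early stopping: train while the validation score keeps improving.

        The assumed "halflife" rule: after `burnin` steps, keep going as long
        as the best score so far occurred within the last half of the run.
        """
        model = initial_model
        best_score, best_iter = float('-inf'), 0
        i = 0
        while i < burnin or i < 2 * best_iter:
            step_fn(model)           # e.g. one epoch of SGD updates, in place
            score = score_fn(model)  # e.g. validation-set accuracy
            if score > best_score:
                best_score, best_iter = score, i
            i += 1
        return model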

Although I expect this specific code might have to change quite a bit in a
final version, I want to draw attention to a few aspects of it:

- we build a symbolic expression graph that contains the whole program, not
  just the learning algorithm

- the configuration language allows for callable objects (e.g. functions,
  curried functions) to be arguments

- there is a lambda function-constructor (`vm_lambda`) we can use in this
  language (see the sketch after this list)

- APIs and protocols are at work in establishing conventions for
  parameter-passing, so that sub-expressions (e.g. datasets, optimization
  algorithms, etc.) can be swapped

- there are no APIs for things which are not passed as arguments (i.e. the
  logic of the whole program is not exposed via some uber-API)
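
For concreteness, here is one hypothetical reading of `vm_lambda` and
`vm_getattr`, continuing the `Apply` sketch above; none of these names or
representations are settled:

    class Symbol(object):
        """A free variable in the expression graph, identified by name."""
        def __init__(self, name):
            self.name = name

    def vm_lambda(argnames, body):
        # A lambda is its body plus the argument names left free in it;
        # the VM substitutes bound values for those names on application.
        return ('lambda', argnames, body)

    def vm_getattr(obj_name, attr):
        # Deferred attribute access: evaluated only once `obj_name` has
        # been bound to a concrete learner object by the VM.
        return Apply(lambda obj, attr: getattr(obj, attr),
                     {'obj': Symbol(obj_name), 'attr': attr})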


K-fold cross-validation of a classifier
----------------------------------------

    splits = kfold_cross_validate(
        indexlist=range(1000),
        train=8,
        valid=1,
        test=1,
        )
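
A minimal sketch of what `kfold_cross_validate` could compute, assuming
`train`, `valid`, and `test` are relative fold counts (here 8:1:1 over ten
folds of 100 indices each); the signature is taken from the example, the
rotation scheme is an assumption:

    def kfold_cross_validate(indexlist, train, valid, test):
        """Return (train, valid, test) index splits by rotating folds."""
        n_folds = train + valid + test
        indexlist = list(indexlist)
        fold_size = len(indexlist) // n_folds
        folds = [indexlist[i * fold_size:(i + 1) * fold_size]
                 for i in range(n_folds)]
        splits = []
        for k in range(n_folds):
            rotated = folds[k:] + folds[:k]
            splits.append((sum(rotated[:train], []),
                           sum(rotated[train:train + valid], []),
                           sum(rotated[train + valid:], [])))
        return splits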

    trained_models = [
        halflife_stopper(
            initial_model=alloc_model('param1', 'param2'),
            burnin=100,
            score_fn=vm_lambda(('learner_obj',),
                classification_error(
                    function=as_classifier('learner_obj'),
                    dataset=MNIST.subset(validation_set))),
            step_fn=vm_lambda(('learner_obj',),
                sgd_step_fn(
                    parameters=vm_getattr('learner_obj', 'params'),
                    cost_and_updates=classif_nll('learner_obj',
                        example_stream=minibatches(
                            source=MNIST.subset(train_set),
                            batchsize=100,
                            loop=True)),
                    n_iter=100)))
        for (train_set, validation_set, test_set) in splits]

    vm.call(trained_models, param1=1, param2=2)
    vm.call(trained_models, param1=3, param2=4)

I want to draw attention to the fact that the `call` method treats the
expression tree as one big lambda expression, with potentially free variables
that must be assigned: here, the 'param1' and 'param2' arguments to
`alloc_model`. There is no need for separate compile and run steps as in
Theano, because these functions are expected to be long-running and called
only once.
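
Refining the earlier `VM.call` sketch, evaluation becomes evaluation in an
environment: a `Symbol` node (e.g. the 'param1' placeholder inside
`alloc_model`) is looked up among the keyword arguments of `call`. As before,
this assumes the hypothetical `Apply`/`Symbol` representation:

    class VM(object):
        def call(self, node, **bindings):
            """Evaluate `node`, substituting `bindings` for free variables."""
            if isinstance(node, Symbol):
                return bindings[node.name]   # e.g. param1=1, param2=2
            if isinstance(node, list):       # e.g. the trained_models list
                return [self.call(n, **bindings) for n in node]
            if isinstance(node, Apply):
                kwargs = dict((k, self.call(v, **bindings))
                              for k, v in node.kwargs.items())
                return node.fn(**kwargs)
            return node                      # constants evaluate to themselves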


Analyze the results of the K-fold cross-validation
--------------------------------------------------

It often happens that a user doesn't know what statistics to compute *before*
running a bunch of learning jobs, but only afterward. This can be done by
extending the symbolic program and calling the extended function.

    vm.call(
        [pylearn.min(model.weights) for model in trained_models],
        param1=1, param2=2)

If this is run after the previous calls:

    vm.call(trained_models, param1=1, param2=2)
    vm.call(trained_models, param1=3, param2=4)

then it should run very quickly, because the `vm` can cache the return values
of the `trained_models` when param1=1 and param2=2.
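
One way such caching could work, continuing the same hypothetical sketch:
memoize each result keyed on the node and the free-variable assignment, so
that re-evaluating `trained_models` under a previously seen binding is just a
dictionary lookup:

    class CachingVM(VM):
        """A VM that memoizes results per (node, free-variable binding)."""
        def __init__(self):
            self.cache = {}

        def call(self, node, **bindings):
            key = (id(node), tuple(sorted(bindings.items())))
            if key not in self.cache:
                # Recursion goes through self.call, so sub-expressions
                # (e.g. each trained model) are cached individually too.
                self.cache[key] = VM.call(self, node, **bindings)
            return self.cache[key]

With this in place, the `pylearn.min` call above recomputes nothing: each
element of `trained_models` was already evaluated under param1=1, param2=2,
and only the new `min` nodes actually run.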