comparison doc/v2_planning/use_cases.txt @ 1189:0e12ea6ba661

fix many rst syntax error warning.
author Frederic Bastien <nouiz@nouiz.org>
date Fri, 17 Sep 2010 20:55:18 -0400
parents 21d25bed2ce9
children
comparing revisions 1188:073c2fab7bcd and 1189:0e12ea6ba661
@@ -54,12 +54,13 @@
 Often the training examples and validation examples come from the same set (e.g.
 a large matrix of all examples) but this is not necessarily the case.
 
 There are many ways that the training could be configured, but here is one:
 
+.. code-block:: python
 
     vm.call(
         halflife_stopper(
             # OD: is n_hidden supposed to be n_classes instead?
             initial_model=random_linear_classifier(MNIST.n_inputs, MNIST.n_hidden, r_seed=234432),
             burnin=100,
             score_fn = vm_lambda(('learner_obj',),
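The `halflife_stopper` in the hunk above belongs to a planning sketch, not an existing API. One plausible reading of the rule its name and `burnin` argument suggest (all names here are hypothetical): train past a burn-in period, then stop once the best validation score is older than half of the steps taken so far.

```python
# Hypothetical sketch of a "halflife" stopping rule: after a burn-in period,
# keep training only while the best validation score was seen within the most
# recent half of all steps so far. Not the actual halflife_stopper API.
def halflife_stop(scores, burnin=100):
    """Return the step at which training would stop, given a full score trace."""
    best_step, best_score = 0, float("-inf")
    for step, score in enumerate(scores):
        if score > best_score:
            best_step, best_score = step, score
        # stop once the best score is older than half the elapsed steps
        if step >= burnin and best_step < step // 2:
            return step
    return len(scores) - 1

# A score trace that improves until step 150 and then plateaus: stopping
# triggers once the last improvement drops out of the most recent half-run.
trace = [i if i < 150 else 150 for i in range(400)]
stop = halflife_stop(trace, burnin=100)
```

With this rule, the longer training has already run, the more patience the stopper grants before giving up, which is the usual motivation for halflife-style criteria.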
@@ -106,17 +107,18 @@
 linear classifier. I hope that, as much as possible, we can avoid the need to
 specify dataset dimensions / number of classes in algorithm constructors. I
 regularly had issues in PLearn with the fact we had for instance to give the
 number of inputs when creating a neural network. I much prefer when this kind
 of thing can be figured out at runtime:
+
 - Any parameter you can get rid of is a significant gain in
   user-friendliness.
 - It's not always easy to know in advance e.g. the dimension of your input
   dataset. Imagine for instance this dataset is obtained in a first step
   by going through a PCA whose number of output dimensions is set so as to
   keep 90% of the variance.
 - It seems to me it fits better the idea of a symbolic graph: my intuition
   (that may be very different from what you actually have in mind) is to
   see an experiment as a symbolic graph, which you instantiate when you
   provide the input data. One advantage of this point of view is it makes
   it natural to re-use the same block components on various datasets /
   splits, something we often want to do.
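The bullet points above can be illustrated with a minimal sketch (not an existing API): a classifier whose constructor takes no dataset-dependent arguments, inferring `n_inputs` and `n_classes` only when `fit` sees the data.

```python
import random

# Illustrative sketch (hypothetical class, not an existing API) of the point
# above: the classifier infers n_inputs and n_classes from the data inside
# fit(), so its constructor takes no dataset-dependent arguments.
class LazyLinearClassifier:
    def __init__(self, r_seed=234432):
        self.rng = random.Random(r_seed)
        self.W = None  # weight matrix, allocated on first fit

    def fit(self, X, y):
        n_inputs = len(X[0])      # inferred from the data
        n_classes = max(y) + 1    # inferred from the labels
        if self.W is None:
            self.W = [[self.rng.gauss(0.0, 0.01) for _ in range(n_classes)]
                      for _ in range(n_inputs)]
        return self

    def predict_one(self, x):
        # score each class as a dot product against its weight column
        scores = [sum(xi * wij for xi, wij in zip(x, col))
                  for col in zip(*self.W)]
        return scores.index(max(scores))

# The same untrained block can be reused on datasets of any width: here it
# discovers 20 inputs and 3 classes entirely at fit() time.
clf = LazyLinearClassifier().fit([[0.0] * 20] * 5, [0, 1, 2, 1, 0])
```

This is also the behaviour that makes the "symbolic graph" view natural: the block is fully defined before any dataset exists, and only takes its concrete shape when data flows in.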
@@ -123,8 +125,10 @@
 
 K-fold cross validation of a classifier
 ---------------------------------------
+
+.. code-block:: python
 
     splits = kfold_cross_validate(
         # OD: What would these parameters mean?
         indexlist = range(1000)
         train = 8,
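Since the `# OD:` comment shows the parameters of `kfold_cross_validate` were still an open question, here is one plausible reading, as a purely hypothetical sketch: partition `indexlist` into `train + valid + test` folds of equal size and rotate which folds play which role.

```python
# Hypothetical reading of the snippet above: split indexlist into
# train + valid + test folds, then yield one (train, valid, test) index
# split per rotation. Names and semantics are guesses, not a real API.
def kfold_splits(indexlist, train=8, valid=1, test=1):
    k = train + valid + test
    # deal indices into k interleaved folds of equal size
    folds = [indexlist[i::k] for i in range(k)]
    for rot in range(k):
        rotated = folds[rot:] + folds[:rot]
        train_idx = [i for fold in rotated[:train] for i in fold]
        valid_idx = [i for fold in rotated[train:train + valid] for i in fold]
        test_idx = [i for fold in rotated[train + valid:] for i in fold]
        yield train_idx, valid_idx, test_idx

# Under this reading, train=8 with one validation and one test fold gives
# ten rotations of an 80/10/10 split over the 1000 indices.
splits = list(kfold_splits(list(range(1000)), train=8, valid=1, test=1))
```

Every index appears in exactly one of the three roles in each rotation, and each fold serves as the test fold exactly once across the ten splits.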