doc/v2_planning/datalearn.txt @ 1361:7548dc1b163c

Some questions/suggestions for datalearn

author:   Razvan Pascanu <r.pascanu@gmail.com>
date:     Thu, 11 Nov 2010 22:40:01 -0500
parents:  5db730bb0e8e
children: 6b9673d72a41

Firstly, if you want to use Theano expressions and compiled functions to
implement the perform() method of an Op, you can do that. Secondly, you can
just include those 'expressions coded in individual datasets' into the
overall graph.

Razvan comments: 1) Having Theano expressions inside the perform() of a
Theano Op can lead to issues. I know I had to deal with a few when
implementing Scan, which does exactly this. To be fair, these issues mostly
come into play when the inner graph has to interact with the outer graph,
and most of the time they can be solved. All I'm saying is that going this
way might lead to some headaches for developers, though I guess some
headaches will be involved no matter what.
2) In my view (I'm not sure this is what Olivier was saying), the idea of
not putting the Dataset into a Variable is to keep the logic related to
loading data, dividing it into slices when running on the GPU, and so on,
out of the Theano variable. In my view this logic goes into a DataSet class
that gives you shared variables, symbolic indices into those shared
variables, and also numeric indices. When looping through those numeric
indices, the dataset class can reload parts of the data into the shared
variable, and so on.
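
As a rough illustration of that division of labour, here is a minimal
sketch of what such a DataSet class could look like (``ChunkedDataSet`` and
its methods are hypothetical names, not an existing Pylearn API):

.. code-block:: python

    import theano
    import theano.tensor as T

    class ChunkedDataSet(object):
        """Sketch only: hold one chunk of the data in a shared variable
        and reload other chunks on demand."""
        def __init__(self, data, chunk_size):
            self.data = data
            self.chunk_size = chunk_size
            # Shared variable holding the chunk currently on the device.
            self.shared = theano.shared(data[:chunk_size], name='chunk')
            # Symbolic index into the shared variable; symbolic
            # expressions can be built on top of self.variable.
            self.symbolic_index = T.iscalar('index')
            self.variable = self.shared[self.symbolic_index]

        def load_chunk(self, chunk):
            # Reload another slice of the data into the shared variable,
            # e.g. between two inner loops over numeric indices.
            start = chunk * self.chunk_size
            self.shared.set_value(self.data[start:start + self.chunk_size])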

One issue with this approach is illustrated by the following example.
Imagine we want to iterate on samples in a dataset and do something with
their numeric value. We would want the code to be as close as possible to:

.. code-block:: python

    # [...]
    for numeric_index in xrange(len(dataset)):
        do_something_with(get_sample(numeric_index))

James comments: this is how I have written the last couple of projects;
it's slightly verbose, but it's clear and efficient.

<Razvan comments>: I assume that ``do_something_with`` is supposed to be
some numeric function, and that ``dataset`` in this case is the result of
some computations on an initial dataset. I would differentiate the two
approaches (1) and (2) as follows:

- first of all, whatever you can do with (1) you can do with (2);
- approach (1) hides the fact that you are working with symbolic graphs:
  you apply functions to datasets, and when you want to see values, a
  function is compiled under the hood and those values are computed for
  you. In approach (2) the fact that you deal with a symbolic graph is
  explicit, because you have to compile your functions manually;
- approach (1) needs the function_storage trick, shared between certain
  nodes of the graph, to reduce the number of compilations, while in
  approach (2) we do not need to deal with the complexity of lazy
  compilation;
- approach (1) needs a replace function if you want to change the dataset:
  once you have a "computational graph" (or pipeline, or whatever you call
  it), say ``graph``, you would change the input by doing
  ``graph.replace({init_data_X: new_data_X})`` (see the sketch just after
  this list). In approach (2), ``init_data_X`` and ``new_data_X`` are the
  ``dataset`` itself, so you would compile two different functions. To make
  this clearer, I would rewrite (2) as:

.. code-block:: python

    symbolic_index = theano.tensor.iscalar()
    get_sample1 = theano.function([symbolic_index],
                                  graph(dataset[symbolic_index]).variable)
    for numeric_index in xrange(len(dataset)):
        do_something_with(get_sample1(numeric_index))

    get_sample2 = theano.function([symbolic_index],
                                  graph(new_dataset[symbolic_index]).variable)
    # Note: the dataset was replaced with new_dataset.
    for numeric_index in xrange(len(new_dataset)):
        do_something_with(get_sample2(numeric_index))

    # For approach (1) you would write:

    for datapoint in graph:
        do_something_with(datapoint())

    new_graph = graph.replace({dataset: new_dataset})

    for datapoint in new_graph:
        do_something_with(datapoint())

- in approach (1) the initial dataset object (the one that loads the data)
  decides whether you will use shared variables and indices to deal with
  the dataset, or ``theano.tensor.matrix``; the user does not (at least not
  without hacking the code). Of course, whoever writes that class can add a
  flag to switch between the behaviours that make sense. In approach (2)
  one is not forced by construction to make this choice inside that class,
  though by convention I would do it. So if you consider the one who writes
  that class a developer, then in (2) it is the user, not the developer,
  who decides/deals with this. Though this is a fine line: I would say the
  user would actually write that class as well, using some template. That
  is to say, (2) looks and feels more like working with Theano directly.
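
A minimal sketch of what ``graph.replace`` could boil down to, assuming a
Theano version that exposes ``theano.clone`` (the helper name
``replace_inputs`` and the stand-in variables are hypothetical):

.. code-block:: python

    import theano
    import theano.tensor as T

    def replace_inputs(output_variable, replacements):
        # Rebuild the symbolic graph with the given substitutions applied,
        # which is essentially what graph.replace would have to do.
        return theano.clone(output_variable, replace=replacements)

    # Usage sketch, with plain variables standing in for datasets:
    init_data_X = T.matrix('init_data_X')
    new_data_X = T.matrix('new_data_X')
    out = T.tanh(init_data_X).sum()
    new_out = replace_inputs(out, {init_data_X: new_data_X})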

Bottom line, I think (1) puts more stress on the development of the
library, and hides Theano and some of the complexity for day-to-day usage.
In (2) everything is a bit more explicit, leaving the impression that you
have more control over the code, though I strongly feel that whatever can
be done in (2) can be done in (1). Traditionally I was more inclined
towards (1), but now I'm not that sure; I think both are equally
interesting and valid options.

</Razvan comments>

Note that although the above example focused on how to iterate over a
dataset, it can be cast into a more generic problem, where some data
(either dataset or sample) is the result of some transformation applied to
other data, which is parameterized by parameters p1, p2, ..., pN (in the
above example, we were [...]

Ideally it would be nice to let the user take control of what is being
compiled, while leaving the option of a sensible default behavior for those
who do not want to worry about it. How to achieve this is still to be
determined.

Razvan comment: I thought about this a bit at the Pylearn level. In my
original train of thought you would have a distinction between
"hand-picked" parameters, which I would call hyper-parameters, and learned
parameters. A transformation in this framework (an op, if you wish) could
take as inputs DataSet(s), DataField(s), Parameter(s) (the things that the
learner should adapt) and HyperParameter(s). All hyper-parameters would
turn into arguments of the compiled function (like the indices of each of
the dataset objects), and therefore they can be changed without
re-compilation. In other words, this can easily be done by having new types
of Variables that represent Parameters and Hyper-parameters. As an ending
note, I would say that there are hyper-parameters for which you need to
recompile the Theano function, and which therefore cannot be just function
arguments (so we would need yet another category?).
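
A minimal sketch of this Parameter / HyperParameter distinction in plain
Theano (the variables here are illustrative, not an existing Pylearn API):

.. code-block:: python

    import numpy
    import theano
    import theano.tensor as T

    x = T.matrix('x')                             # data
    w = theano.shared(numpy.zeros(5), name='w')   # Parameter: learned
    lr = T.scalar('lr')                           # HyperParameter: argument
    cost = T.sqr(T.dot(x, w)).mean()
    train = theano.function([x, lr], cost,
                            updates=[(w, w - lr * T.grad(cost, w))])
    # lr can now change between calls without recompiling, e.g.
    # train(batch, 0.1) then train(batch, 0.01). A structural
    # hyper-parameter (say, the number of hidden units) would still
    # force recompilation, hence the extra category mentioned above.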

Another syntactic option for iterating over datasets is:

.. code-block:: python

    # [...]

In the code above, if one wants to obtain the numeric value of an element
of ``multiple_fields_dataset``, the Theano function being compiled would be
able to optimize computations so that the simultaneous computation of
``prediction`` and ``cost`` is done efficiently.
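
In plain Theano terms, this is the gain from compiling one function with
several outputs: shared sub-graphs are computed once. A small sketch (the
variables are illustrative, not taken from the elided code above):

.. code-block:: python

    import numpy
    import theano
    import theano.tensor as T

    x = T.vector('x')
    y = T.scalar('y')
    W = theano.shared(numpy.zeros((3, 4)), name='W')
    hidden = T.tanh(T.dot(x, W))    # common sub-graph
    prediction = hidden.sum()
    cost = T.sqr(prediction - y)
    # One compiled function with both outputs lets Theano compute
    # `hidden` only once for prediction and cost.
    f = theano.function([x, y], [prediction, cost])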

Razvan asks: What is predict_sample for? What is predict_dataset? What I
guess you mean is that the decorator is used to convert a function that
takes a Theano variable and outputs a Theano variable into a class/function
that takes a DataField/DataSet and outputs a DataField/DataSet. It could
also register all those different functions, so that the Dataset you get
out of the entire Learner (not out of one of the functions; this Dataset is
returned by __call__) would contain all of those as fields. I would use it
like this:

.. code-block:: python

    nnet = NeuralNetwork()
    results = nnet(dataset)
    for datapoint in results:
        print datapoint.prediction, datapoint.nll, ...

Is this close to what you are suggesting?