doc/v2_planning/datalearn.txt @ 1362:6b9673d72a41
Datalearn replies / comments

author: Olivier Delalleau <delallea@iro>
date:   Fri, 12 Nov 2010 10:39:19 -0500
[...]

up the ability to combine Theano expressions coded in individual datasets?
Firstly, if you want to use Theano expressions and compiled functions to
implement the perform() method of an Op, you can do that. Secondly, you can
just include those 'expressions coded in individual datasets' into the overall
graph.

OD replies to James: What I had in mind is that you would be forced to compile
your own function inside the perform() method of an Op. This seemed like a
potential problem to me because it would prevent Theano from seeing the whole
fine-grained graph and doing optimizations across multiple dataset
transformations (there may also be additional overhead from calling multiple
functions). But if you are saying it is possible to include 'expressions coded
in individual datasets' into the overall graph, then I guess this point is
moot. Would this be achieved with an optimization that replaces the dataset
node with its internal graph?
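
To make the compile-inside-perform() concern concrete, here is a minimal
sketch (not code from the proposal; the Op name and the tanh transformation
are invented) of an Op that lazily compiles its own Theano function inside
perform(). The outer graph only sees an opaque node, so no optimization can
cross it.

.. code-block:: python

    import theano
    import theano.tensor as T

    class DatasetTransformOp(theano.Op):
        """Illustrative Op hiding a dataset transformation behind perform()."""

        def make_node(self, x):
            x = T.as_tensor_variable(x)
            return theano.Apply(self, [x], [x.type()])

        def perform(self, node, inputs, output_storage):
            # The inner expression is compiled here, invisible to the outer
            # graph's optimizer, and incurs a separate function call.
            if not hasattr(self, '_fn'):
                inner_x = node.inputs[0].type()
                self._fn = theano.function([inner_x], T.tanh(inner_x))
            output_storage[0][0] = self._fn(inputs[0])

    x = T.matrix('x')
    y = DatasetTransformOp()(x)  # Theano sees only an opaque node here

The optimization hinted at above would amount to swapping such a node for its
inner graph before compilation, so that the fine-grained expressions become
visible again.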

Razvan comments: 1) Having Theano expressions inside the perform of a Theano
Op can lead to issues. I know I had to deal with a few when implementing
Scan, which does exactly this. Well, to be fair, these issues mostly come into
play when the inner graph has to interact with the outer graph and most of

[...]

that gives you shared variables, symbolic indices into those shared
variables, and also numeric indices. When looping through those numeric
indices, the dataset class can reload parts of the data into the
shared variable, and so on.
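
As a rough illustration of the pattern described just above (my own sketch,
not code from the proposal; the chunk size, variable names and the tanh
expression are invented): the dataset keeps one chunk of the data in a shared
variable, exposes a symbolic index into that chunk, and reloads the shared
variable while looping over numeric indices.

.. code-block:: python

    import numpy
    import theano
    import theano.tensor as T

    data = numpy.random.randn(1000, 5).astype(theano.config.floatX)
    chunk_size = 100

    buf = theano.shared(data[:chunk_size])  # chunk currently held in memory
    index = T.lscalar('index')              # symbolic index into the chunk
    sample = buf[index]                     # symbolic sample; build on top of it
    get_output = theano.function([index], T.tanh(sample))

    chunk_start = 0
    for i in range(len(data)):              # numeric indices over the dataset
        if not (chunk_start <= i < chunk_start + chunk_size):
            chunk_start = (i // chunk_size) * chunk_size
            buf.set_value(data[chunk_start:chunk_start + chunk_size])
        value = get_output(i - chunk_start)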

OD replies to Razvan's point 2: I think what you are saying is another concern
I had, which was that it may be confusing to mix the Variable/Op and DataSet
interfaces in the same class. I would indeed prefer to keep them separate.
However, it may be possible to come up with a system that would get the best
of both worlds (maybe by having the Op/Variable as members of Dataset, and
just asking the user building a theano graph to use these instead of the
dataset directly). Note that I'm mixing up Op/Variable here, because it's
just not clear to me yet which would go where...

One issue with this approach is illustrated by the following example. Imagine
we want to iterate on samples in a dataset and do something with their
numeric value. We would want the code to be as close as possible to:

[...]
explicit because you have to manually compile your functions.
- approach (1) needs to use this function_storage trick shared between
  certain nodes of the graph to reduce the number of compilations, while in
  approach (2) we don't need to deal with the complexity of lazy
  compilation

OD comments: Well, to be fair, it means we put the burden of dealing with the
complexity of lazy compilation on the user (it's up to him to make sure he
compiles only one function).

- approach (1) needs a replace function if you want to change the dataset.
  What you would do is, once you have a "computational graph" or pipeline
  or whatever you call it, say ``graph``, change the input by calling
  graph.replace({init_data_X: new_data_X}). In approach (2), init_data_X
  and new_data_X are the ``dataset``, so you would compile two different

[...]

    new_graph = graph.replace({dataset: dataset2})

    for datapoint in new_graph:
        do_something_with(datapoint())

OD comments: I don't really understand what 'graph' is in this code (it
appears in both approaches but is used differently). What I have in mind would
be more like the first approach you describe (#2) with 'graph' removed, and
the second one (#1) with graph / new_graph replaced by dataset / new_dataset.
You wouldn't need to call some graph.replace method: the graphs compiled for
iterating on 'dataset' and 'new_dataset' would be entirely separate (using two
different compiled functions, pretty much like #2).
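
A small sketch of that variant (my own illustration with a made-up helper;
the tanh expression again stands in for the real transformation): each dataset
compiles its own iteration function, so no replace step is ever needed.

.. code-block:: python

    import numpy
    import theano
    import theano.tensor as T

    def do_something_with(value):
        pass  # placeholder, as in the examples above

    def iterate(numeric_data):
        # One compiled function per dataset; nothing is shared between them.
        data = theano.shared(
            numpy.asarray(numeric_data, dtype=theano.config.floatX))
        i = T.lscalar('i')
        fn = theano.function([i], T.tanh(data[i]))
        for j in range(len(numeric_data)):
            yield fn(j)

    for datapoint in iterate(numpy.random.randn(10, 3)):    # first function
        do_something_with(datapoint)
    for datapoint in iterate(numpy.random.randn(20, 3)):    # second function
        do_something_with(datapoint)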

- in approach (1) the initial dataset object (the one that loads the data)
  decides if you will use shared variables and indices to deal with the
  dataset or if you will use ``theano.tensor.matrix``, and not the user (at
  least not without hacking the code). Of course whoever writes that class

[...]
types of Variables that would represent Parameters and Hyper-parameters.
And as an ending note I would say that there are hyper-parameters for which
you need to recompile the theano function and which cannot be just parameters
(so we would have yet another category?).

James: Another syntactic option for iterating over datasets is

.. code-block:: python

    for sample in dataset.numeric_iterator(batchsize=10):
        do_something_with(sample)

The numeric_iterator would create a symbolic batch index, and compile a single
function that extracts the corresponding minibatch. The arguments to the
numeric_iterator function can also specify what compile mode to use, any givens
you might want to apply, etc.
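
For concreteness, here is one way such a method might look (a sketch under my
own assumptions: the minibatch is taken as a slice of a shared variable, and
``variable`` / ``n_samples`` are invented attribute names):

.. code-block:: python

    import numpy
    import theano
    import theano.tensor as T

    class MiniDataset(object):
        """Stand-in dataset; the real class would expose a symbolic output."""

        def __init__(self, numeric_data):
            self.n_samples = len(numeric_data)
            self.variable = theano.shared(
                numpy.asarray(numeric_data, dtype=theano.config.floatX))

        def numeric_iterator(self, batchsize=10, mode=None, givens=None):
            index = T.lscalar('index')  # symbolic batch index
            batch = self.variable[index * batchsize:(index + 1) * batchsize]
            fn = theano.function([index], batch, mode=mode, givens=givens)
            for i in range(self.n_samples // batchsize):
                yield fn(i)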

OD comments: Would there also be some kind of function cache to avoid
compiling the same function again if we re-iterate on the same dataset with
the same arguments? Maybe a more generic issue is: would there be a way for
Theano to be more efficient when re-compiling the same function that was
already compiled in the same program? (Note that I am assuming here it is not
efficient, but I may be wrong.)
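
One simple answer to the caching question, continuing the hypothetical sketch
above: memoize the compiled function on the arguments that determine it
(``_fn_cache`` would be a dict created in ``__init__``). This does not address
Theano-level recompilation cost, only re-use within one dataset object.

.. code-block:: python

    def numeric_iterator(self, batchsize=10, mode=None):
        key = (batchsize, mode)
        if key not in self._fn_cache:  # reuse the function compiled earlier
            index = T.lscalar('index')
            batch = self.variable[index * batchsize:(index + 1) * batchsize]
            self._fn_cache[key] = theano.function([index], batch, mode=mode)
        fn = self._fn_cache[key]
        for i in range(self.n_samples // batchsize):
            yield fn(i)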

What About Learners?
--------------------

The discussion above only mentioned datasets, but not learners. The learning

[...]

James asks:
What's wrong with simply passing the variables corresponding to the dataset to
the constructor of the learner?
That seems much more flexible, compact, and clear than the decorator.

OD replies: Not sure I understand your idea here. We probably want a learner
to be able to compute its output on multiple datasets, without having to point
to these datasets within the learner itself (which seems cumbersome to me).
The point of the decorators is mostly to turn a single function (that outputs
a theano variable for the output computed on a single sample) into a function
that can compute outputs on symbolic datasets as well as on numeric samples.
Those could also instead be different functions in the base Learner class if
the decorator approach is considered ugly / confusing.
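
To illustrate what such a decorator might do (my own sketch; the decorator
name, the DataSet stand-in and the dispatch rule are assumptions, not the
actual proposal):

.. code-block:: python

    import numpy
    import theano
    import theano.tensor as T

    class DataSet(object):
        """Stand-in symbolic dataset: one row per sample."""
        def __init__(self, variable):
            self.variable = variable

    def datalearn_output(single_sample_fn):
        # Same method works on a symbolic DataSet (returning a new symbolic
        # DataSet) and on a numeric sample (returning a numeric output).
        def wrapper(self, data):
            if isinstance(data, DataSet):
                out, _ = theano.map(lambda row: single_sample_fn(self, row),
                                    sequences=data.variable)
                return DataSet(out)
            x = T.vector('x')
            fn = theano.function([x], single_sample_fn(self, x))
            return fn(numpy.asarray(data, dtype=theano.config.floatX))
        return wrapper

    class LinearLearner(object):
        def __init__(self, n_in, n_out):
            w = numpy.random.randn(n_in, n_out).astype(theano.config.floatX)
            self.W = theano.shared(w)

        @datalearn_output
        def prediction(self, sample):
            # Theano expression for the output on one symbolic sample.
            return T.dot(sample, self.W)

Usage would then be symmetric: learner.prediction(dataset) returns another
symbolic dataset, while learner.prediction(numpy_sample) returns numbers.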

A Learner may be able to compute various things. For instance, a Neural
Network may output a ``prediction`` vector (whose elements correspond to
estimated probabilities of each class in a classification task), as well as a
``cost`` vector (whose elements correspond to the penalized NLL, the NLL alone

[...]

    for datapoint in results:
        print datapoint.prediction, datapoint.nll, ...

Is this close to what you are suggesting?

OD: Yes, you guessed right: the decorator's role is to do something different
depending on the input to the function (see my reply to James above).