Mercurial > pylearn
comparison doc/v2_planning/learner.txt @ 1058:e342de3ae485
v2planning learner - added comments and TODO points
author | James Bergstra <bergstrj@iro.umontreal.ca> |
---|---|
date | Thu, 09 Sep 2010 11:49:57 -0400 |
parents | bc3f7834db83 |
children | f082a6c0b008 |
1057:baf1988db557 | 1058:e342de3ae485 |
---|---|
254 etc.) Such a learner would replace synchronous instructions (return on completion) with | 254 etc.) Such a learner would replace synchronous instructions (return on completion) with |
255 asynchronous ones (return after scheduling) and the active instruction set would also change | 255 asynchronous ones (return after scheduling) and the active instruction set would also change |
256 asynchronously, but neither of these things is inconsistent with the Learner API. | 256 asynchronously, but neither of these things is inconsistent with the Learner API. |
257 | 257 |
258 | 258 |
259 TODO | 259 TODO - Experiment API? |
260 ~~~~ | 260 ~~~~~~~~~~~~~~~~~~~~~~ |
261 | 261 |
262 I feel like something is missing from the API - and that is an interface to the graph structure | 262 I feel like something is missing from the API - and that is an interface to the graph structure |
263 discussed above. The nodes in this graph are natural places to store meta-information for | 263 discussed above. The nodes in this graph are natural places to store meta-information for |
264 visualization, statistics-gathering etc. But none of the APIs above corresponds to the graph | 264 visualization, statistics-gathering etc. But none of the APIs above corresponds to the graph |
265 itself. In other words, there is no API through which to attach information to nodes. It is | 265 itself. In other words, there is no API through which to attach information to nodes. It is |
266 not good to say that the Learner instance *is* the node because (a) learner instances change | 266 not good to say that the Learner instance *is* the node because (a) learner instances change |
267 during graph exploration and (b) learner instances are big, and we don't want to have to keep a | 267 during graph exploration and (b) learner instances are big, and we don't want to have to keep a |
268 whole saved model just to attach meta-info e.g. validation score. Choosing this API spills | 268 whole saved model just to attach meta-info e.g. validation score. Choosing this API spills |
269 over into other committees, so we should get their feedback about how to resolve it. | 269 over into other committees, so we should get their feedback about how to resolve |
270 | 270 it. Maybe we need an 'Experiment' API to stand for this graph? |
271 Comments | 271 |
272 ~~~~~~~~ | 272 |
273 TODO: Validation & Monitoring Costs | |
274 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
275 | |
276 Even if we do have the Experiment API as a structure to hang validation and | |
277 monitoring results, what should be the mechanism for extracting those results? | |
278 The Learner API is not right because extracting a monitoring cost doesn't change | |
279 the model, doesn't change the legal instructions/edges etc. Maybe we should use | |
280 a similar mechanism to Instruction, called something like Measurement? Any node | |
281 / learner can report the list of instructions (for moving) and the list of | |
282 measurements (and the cost of computing them too). | |
283 | |
284 | |
285 TODO - Parameter Distributions | |
286 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
273 | 287 |
274 YB asks: it seems to me that what we really need from "Type" is not just | 288 YB asks: it seems to me that what we really need from "Type" is not just |
275 testing that a value is legal, but more practically a function that specifies the | 289 testing that a value is legal, but more practically a function that specifies the |
276 prior distribution for the hyper-parameter, i.e., how to sample from it, | 290 prior distribution for the hyper-parameter, i.e., how to sample from it, |
277 and possibly some representation of it that could be used to infer | 291 and possibly some representation of it that could be used to infer |
279 Having the min and max and default limits us to the uniform distribution, | 293 Having the min and max and default limits us to the uniform distribution, |
280 which may not always be appropriate. For example sometimes we'd like | 294 which may not always be appropriate. For example sometimes we'd like |
281 Gaussian (-infty to infty) or Exponential (0 to infty) or Poisson (non-negative integers). | 295 Gaussian (-infty to infty) or Exponential (0 to infty) or Poisson (non-negative integers). |
282 For that reason, I think that "Type" is not a very good name. | 296 For that reason, I think that "Type" is not a very good name. |
283 How about "Prior" or "Density" or something like that? | 297 How about "Prior" or "Density" or something like that? |
298 | |
299 JB replies: I agree that being able to choose (and update) distributions over | |
300 these values is important. I don't think the Type structure is the right place | |
301 to handle it though. The challenge is to allow those distributions to change | |
302 for a variety of reasons - e.g. the sampling distribution on the capacity | |
303 variables is affected by the size of the dataset; it is also affected by | |
304 previous experience in general as well as experiments on that particular | |
305 dataset. I'm not sure that the 'Type' structure is right to deal with this. | |
306 Also, even with a strategy for handling these distributions, I believe a simple | |
307 mechanism for rejecting insane values might be useful. | |
308 | |
309 So how should we handle it? Hmmm... | |
310 | |
311 | |
312 Comments | |
313 ~~~~~~~~ | |
284 | 314 |
285 OD asks: (I hope it's ok to leave comments even though I'm not in committee... I'm | 315 OD asks: (I hope it's ok to leave comments even though I'm not in committee... I'm |
286 interested to see how the learner interface is shaping up so I'll be keeping | 316 interested to see how the learner interface is shaping up so I'll be keeping |
287 an eye on this file) | 317 an eye on this file) |
288 I'm wondering what's the benefit of such an API compared to simply defining a | 318 I'm wondering what's the benefit of such an API compared to simply defining a |