# HG changeset patch
# User Olivier Delalleau
# Date 1289852449 18000
# Node ID f3a549bd8688d4d56e8168162090a64e0c57f6f2
# Parent ad53f73020c2d107c87e1d7b7669e1855ccfec2c
datalearn: Added another comment on James' numeric iterator function

diff -r ad53f73020c2 -r f3a549bd8688 doc/v2_planning/datalearn.txt
--- a/doc/v2_planning/datalearn.txt	Mon Nov 15 13:49:25 2010 -0500
+++ b/doc/v2_planning/datalearn.txt	Mon Nov 15 15:20:49 2010 -0500
@@ -461,6 +461,28 @@
 already compiled in the same program? (note that I am assuming here it is
 not efficient, but I may be wrong).
 
+OD adds: After thinking more about it, this seems very close to my first
+version where a function is automatically compiled "under the hood" when
+iterating on a dataset and accessing the numeric value of a resulting
+sample. The main differences are:
+- In your version, the result is directly a numeric value, while in my version
+  one would obtain symbolic samples and would need to call some method to
+  obtain their numeric value. I think I like mine a bit better because it
+  means you can use the same syntax to e.g. iterate on a dataset, whether you
+  are interested in the symbolic representation of samples, or their numeric
+  values. On another hand, doing so could be less efficient since you create an
+  intermediate representation you may not use. The overhead does not seem much
+  to me but I am not sure about that.
+- In your version, you can provide to the function e.g. compile modes /
+  givens. This could probably also be done in my version, although it makes it
+  more difficult if you want to cache the function to avoid compiling it more
+  than once (see next point).
+- (Related to my first comment above) In your version it seems like a new
+  function would be compiled every time the user calls e.g.
+  'numeric_iterator', while in my version the function would be compiled only
+  once. Maybe this can be solved at the Theano level with an efficient
+  function cache?
+
 Discussion: Dataset as Learner Ouptut
 -------------------------------------
 
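
The trade-off raised in the added comment (symbolic samples with a cached compiled function, versus a numeric iterator that compiles on every call) can be illustrated with a small pure-Python sketch. It deliberately does not use Theano: `compile_function`, `SymbolicSample`, `Dataset`, `compiled` and `numeric` are hypothetical stand-ins invented for illustration, and only the name `numeric_iterator` is taken from the discussion. The sketch assumes that compilation is the expensive step and simply counts how often it runs.

```python
# Hypothetical sketch: apart from 'numeric_iterator', none of these names come
# from Theano or from the planning document.
compile_count = 0  # how many times the (fake) compilation step has run


def compile_function(expr, mode="default"):
    """Stand-in for an expensive symbolic-to-numeric compilation step."""
    global compile_count
    compile_count += 1
    return lambda raw: expr(raw)


class SymbolicSample:
    """A sample that knows how to compute its value but does not hold it yet."""

    def __init__(self, dataset, raw):
        self.dataset = dataset
        self.raw = raw

    def numeric(self, mode="default"):
        # The dataset owns the cache, so all samples share one compiled function.
        return self.dataset.compiled(mode)(self.raw)


class Dataset:
    def __init__(self, raw_data, expr):
        self.raw_data = raw_data
        self.expr = expr
        self._cache = {}  # maps compile mode -> compiled function

    def compiled(self, mode="default"):
        # Compile at most once per mode. Extra options such as givens would
        # have to become part of this key, which is what makes caching harder.
        if mode not in self._cache:
            self._cache[mode] = compile_function(self.expr, mode)
        return self._cache[mode]

    def __iter__(self):
        # Same iteration syntax whether the caller wants symbolic or numeric samples.
        return (SymbolicSample(self, raw) for raw in self.raw_data)

    def numeric_iterator(self, mode="default"):
        # Uncached variant: a fresh compilation on every call.
        fn = compile_function(self.expr, mode)
        return (fn(raw) for raw in self.raw_data)


data = Dataset([1, 2, 3], expr=lambda x: 2 * x)

# Two passes over symbolic samples: the function is compiled once and reused.
cached_values = [s.numeric() for s in data] + [s.numeric() for s in data]
compiles_with_cache = compile_count

# Two calls to the numeric iterator: each call compiles again.
uncached_values = list(data.numeric_iterator()) + list(data.numeric_iterator())
compiles_without_cache = compile_count - compiles_with_cache
```

Both variants yield the same numeric values; they differ only in how many times the compilation stand-in runs. Keying the cache on the compile mode hints at why passing modes or givens complicates caching: every such option would need to be part of the key.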