Mercurial > pylearn
changeset 1082:f9f72ae84313
dataset: Added a couple points we did not have time to discuss during meeting
author | Olivier Delalleau <delallea@iro> |
---|---|
date | Fri, 10 Sep 2010 15:36:23 -0400 |
parents | 446bd478953f |
children | 4c00af69c164 |
files | doc/v2_planning/dataset.txt |
diffstat | 1 files changed, 24 insertions(+), 0 deletions(-) |
--- a/doc/v2_planning/dataset.txt	Fri Sep 10 14:14:29 2010 -0400
+++ b/doc/v2_planning/dataset.txt	Fri Sep 10 15:36:23 2010 -0400
@@ -204,4 +204,28 @@
 API?
+Field names and attributes
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+OD: One important question is how to handle fields' names and characteristics.
+For instance, it can be useful to know that the 3rd input field represents a
+number of fingers, and is a non-negative discrete field whose numeric value is
+meaningful (compared to, say, an integer index that would correspond to an
+animal's category). We mentioned metadata during the meeting, but we did not
+get into its details: that may be the place to put this kind of information.
+
+
+Freeing memory
+~~~~~~~~~~~~~~
+
+OD: It is sometimes useful to be able to free memory used by previous
+computations. A typical example is when you load the original dataset into
+memory, then perform various processing steps, ending with a new dataset that
+you also store in memory before feeding it to the learner. Unless you very
+carefully design your code to avoid it, your original dataset will still
+remain in memory (as well as maybe the results of some computations performed
+along the way). So there may be a use for a `clear()` method that would be
+called by the topmost dataset (the one doing the final memory caching), and
+would be forwarded iteratively to previous datasets so as to get back all this
+wasted memory space.
+
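
Illustration (not part of the changeset): the two points raised in this diff could be sketched roughly as follows. This is only a minimal sketch under stated assumptions. `FieldInfo`, `Dataset`, `rows()` and `cache_and_clear()` are hypothetical names invented for this example and are not an existing pylearn API; only the `clear()` method name comes from the text above. Field metadata is modelled as a small record attached to each dataset, and `clear()` drops a dataset's cache before forwarding the call to the dataset it was built from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldInfo:
    """Hypothetical per-field metadata record (assumption, not a pylearn class)."""
    name: str
    discrete: bool
    ordered: bool              # True when the numeric value itself is meaningful
    non_negative: bool = False


class Dataset:
    """Hypothetical dataset that caches its rows in memory."""

    def __init__(self, fields, rows=None, source=None, transform=None):
        self.fields = list(fields)
        self.source = source          # previous dataset in the chain, if any
        self.transform = transform    # applied to each row of `source`
        self._cache = list(rows) if rows is not None else None

    def rows(self):
        """Return this dataset's rows, computing and caching them if needed."""
        if self._cache is None:
            if self.source is None:
                raise RuntimeError("data was cleared and cannot be recomputed")
            self._cache = [self.transform(row) for row in self.source.rows()]
        return self._cache

    def clear(self):
        """Drop the cached rows, then forward the call to the previous dataset."""
        self._cache = None
        if self.source is not None:
            self.source.clear()

    def cache_and_clear(self):
        """Called on the topmost dataset: cache the final rows, free the rest."""
        self.rows()
        if self.source is not None:
            self.source.clear()


# Two fields: a meaningful count and a plain category index.
fields = [
    FieldInfo("fingers", discrete=True, ordered=True, non_negative=True),
    FieldInfo("animal", discrete=True, ordered=False),
]

raw = Dataset(fields, rows=[(5, 0), (4, 2)])
scaled = Dataset(fields, source=raw, transform=lambda r: (r[0] * 2, r[1]))
final = Dataset(fields, source=scaled, transform=lambda r: (r[0] + 1, r[1]))

# Only `final` keeps its rows; `scaled` and `raw` release theirs.
final.cache_and_clear()
```

In this sketch a cleared dataset with no source cannot rebuild its rows, so calling `raw.rows()` afterwards raises an error. Whether cleared data should be reloadable (for example from disk) is a design choice the text above leaves open.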