# HG changeset patch
# User Olivier Delalleau <delallea@iro>
# Date 1284147383 14400
# Node ID f9f72ae84313026b6db95a409d9cc4d0804e3a61
# Parent  446bd478953ffc6e6eafd119cd9a7b5335c95f87
dataset: Added a couple points we did not have time to discuss during meeting

diff -r 446bd478953f -r f9f72ae84313 doc/v2_planning/dataset.txt
--- a/doc/v2_planning/dataset.txt	Fri Sep 10 14:14:29 2010 -0400
+++ b/doc/v2_planning/dataset.txt	Fri Sep 10 15:36:23 2010 -0400
@@ -204,4 +204,28 @@
 API?
 
 
+Field names and attributes
+~~~~~~~~~~~~~~~~~~~~~~~~~~
 
+OD: One important question is how to handle fields' names and characteristics.
+For instance, it can be useful to know that the 3rd input field represents a
+number of fingers, and is a non-negative discrete field whose numeric value is
+meaningful (as opposed to, say, an integer index that would correspond to an
+animal's category). We mentioned metadata during the meeting, but we did not
+get into its details: that may be the place to put this kind of information.
+
+
+Freeing memory
+~~~~~~~~~~~~~~
+
+OD: It is sometimes useful to be able to free memory used by previous
+computations. A typical example is when you load the original dataset into
+memory, then perform various processing steps, ending with a new dataset that
+you also store in memory before feeding it to the learner. Unless you very
+carefully design your code to avoid it, your original dataset will remain in
+memory (possibly along with the results of some computations performed along
+the way). So there may be a use for a `clear()` method that would be called
+by the topmost dataset (the one doing the final memory caching), and would be
+forwarded in turn to previous datasets so as to reclaim all this wasted
+memory space.
+