comparison doc/v2_planning/dataset.txt @ 1338:91637815b7ca

Added a comment on the dataset vs. task issue
author Olivier Delalleau <delallea@iro>
date Thu, 21 Oct 2010 14:03:12 -0400
parents 7dfc3d3052ea
children 158493f8dff9
1337:7dfc3d3052ea 1338:91637815b7ca
565 - It is also ok to have datasets that do not support random access (so the
566 only way to access samples is through iteration).
567 - Ideally, data should be deterministic (i.e. __call__() should always
568 return the same thing). It would probably be up to the user to be super
569 careful if he decides to use a non-deterministic dataset.
570 - About the "task vs. dataset" distinction: this could be achieved by
571 associating with a task the names of the fields it requires (e.g. "input"
572 and "target" for the regression task), and, if the dataset does not
573 already define these fields, using a dataset wrapper that does
574 (saying for instance that "input" is the concatenation of "x1" and "x2",
575 and "target" is "y", for a dataset whose fields are x1, x2 and y).
576
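The wrapper idea in the "task vs. dataset" comment can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's API: the names ListDataset, FieldMappingDataset, adapt_to_task and REGRESSION_FIELDS are hypothetical, samples are assumed to be dicts mapping field names to lists of floats, and access is by iteration only (matching the note that random access is optional).

```python
# Hypothetical sketch: none of these names come from the planning document.

class ListDataset:
    """Minimal iteration-only dataset; each sample is a dict of field -> list."""

    def __init__(self, samples):
        self.samples = list(samples)
        # Field names are read from the first sample (assumed uniform).
        self.fields = set(self.samples[0]) if self.samples else set()

    def __iter__(self):
        return iter(self.samples)


class FieldMappingDataset:
    """Wrapper that exposes task-level fields built from a dataset's own fields."""

    def __init__(self, dataset, mapping):
        self.dataset = dataset
        # mapping: task field name -> list of source fields to concatenate
        self.mapping = mapping
        self.fields = set(mapping)

    def __iter__(self):
        for sample in self.dataset:
            yield {
                field: [v for name in sources for v in sample[name]]
                for field, sources in self.mapping.items()
            }


# A task is described here simply by the field names it requires.
REGRESSION_FIELDS = ("input", "target")


def adapt_to_task(dataset, required_fields, mapping):
    """Return the dataset as-is if it defines the required fields, else wrap it."""
    if all(field in dataset.fields for field in required_fields):
        return dataset
    return FieldMappingDataset(dataset, mapping)


raw = ListDataset([
    {"x1": [1.0], "x2": [3.0], "y": [0.5]},
    {"x1": [2.0], "x2": [4.0], "y": [1.5]},
])
task_view = adapt_to_task(
    raw, REGRESSION_FIELDS, {"input": ["x1", "x2"], "target": ["y"]}
)
```

Because the wrapper only reorganizes fields, it stays deterministic whenever the underlying dataset is, and a dataset that already defines "input" and "target" is passed through without wrapping.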