diff doc/v2_planning/dataset.txt @ 1054:a474fabd1f37

v2_planning dataset - added questions
author James Bergstra <bergstrj@iro.umontreal.ca>
date Wed, 08 Sep 2010 16:20:20 -0400
parents 1b61cbe0810b
children 20a1af112a75
--- a/doc/v2_planning/dataset.txt	Wed Sep 08 16:19:36 2010 -0400
+++ b/doc/v2_planning/dataset.txt	Wed Sep 08 16:20:20 2010 -0400
@@ -153,3 +153,39 @@
 xt,yt = mnist_data.get_batch(batches_train[0])
 xv,yv = mnist_data.get_batch(batches_valid[0])
 
+
+
+
+COMMENTS
+~~~~~~~~
+
+
+JB asks: What may be passed as argument to the functions in Dataset, and what
+can be expected in return?  Are there side effects (e.g. on the state of the
+Dataset) associated with any of the functions?
+
+JB asks: What properties are part of the Dataset API? What possible types can
+they have?  Are they expected to be read-only or writable?  What do they mean?
+
+
+JB asks: What is a view?  Does set_view change the Dataset or return a new
+Dataset with a certain view of the original (in which case call it get_view)?
+Does the view imply the types of the return values of functions like
+get_batch?  What is the difference between the view and the subclasses of
+Dataset in PyML?
+
+JB asks:  Do container formats (I'm thinking of HDF5) offer features for fast
+retrieval that we would like to expose via this interface?
+
+JB asks: How would you recommend using this sort of dataset in a boosting
+algorithm where points need to be re-weighted?
+
+
+JB asks: Do we want to provide for the possibility of feedback that modifies the
+dataset?  For example, curriculum learning might be adaptive in this sense, or
+if we wanted to provide a virtual world for an agent as a dataset, then we would
+need to provide 'actions' to get the next batch.  Could this be done in the
+current API?
+
+
+
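Note on the set_view / get_view question: the two readings can be told apart
with a small sketch. Everything below is an assumption made for illustration
only. ToyDataset, the view names "pairs" and "columns", and the method bodies
are hypothetical and are not taken from dataset.txt; only the names set_view,
get_view and get_batch come from the question itself.

```python
import copy


class ToyDataset:
    """Hypothetical stand-in, NOT the Dataset class proposed in dataset.txt."""

    def __init__(self, rows, view="pairs"):
        self.rows = list(rows)  # each row is an (input, target) tuple
        self.view = view        # hypothetical view names: "pairs", "columns"

    def set_view(self, view):
        # Mutating reading: changes this object in place and returns nothing.
        self.view = view

    def get_view(self, view):
        # Non-mutating reading: returns a new object that shares the same
        # underlying rows but reports them through a different view.
        other = copy.copy(self)  # shallow copy, so `rows` is shared
        other.view = view
        return other

    def get_batch(self, indices):
        batch = [self.rows[i] for i in indices]
        if self.view == "pairs":
            return batch
        # "columns" view: one list of inputs and one list of targets
        xs = [x for x, _ in batch]
        ys = [y for _, y in batch]
        return xs, ys


data = ToyDataset([(0.1, 0), (0.2, 1), (0.3, 0)])

cols = data.get_view("columns")        # `data` itself is left untouched
pairs_batch = data.get_batch([0, 1])   # still in the "pairs" view
cols_batch = cols.get_batch([0, 1])

data.set_view("columns")               # side effect on `data`
mutated_batch = data.get_batch([0, 1])
```

In this sketch the view does determine the return type of get_batch (a list
of tuples versus a tuple of lists), and only set_view has a side effect on the
Dataset. Whether the proposed API intends either behaviour is exactly what the
question above leaves open.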