Mercurial > pylearn
diff doc/v2_planning/dataset.txt @ 1054:a474fabd1f37
v2_planning dataset - added questions
author | James Bergstra <bergstrj@iro.umontreal.ca> |
---|---|
date | Wed, 08 Sep 2010 16:20:20 -0400 |
parents | 1b61cbe0810b |
children | 20a1af112a75 |
--- a/doc/v2_planning/dataset.txt Wed Sep 08 16:19:36 2010 -0400
+++ b/doc/v2_planning/dataset.txt Wed Sep 08 16:20:20 2010 -0400
@@ -153,3 +153,39 @@
 xt,yt = mnist_data.get_batch(batches_train[0])
 xv,yv = mnist_data.get_batch(batches_valid[0])
+
+
+
+COMMENTS
+~~~~~~~~
+
+
+JB asks: What may be passed as argument to the functions in Dataset, and what
+can be expected in return? Are there side effects (e.g. on the state of the
+Dataset) associated with any of the functions?
+
+JB asks: What properties are part of the Dataset API? What possible types can
+they have, are they expected to be read-only or writeable? What do they mean?
+
+
+JB asks: What is a view? Does set_view change the Dataset or return a new
+Dataset with a certain view of the original (in which case call it get_view)?
+Does the view imply the types of the return-value of functions like
+get_batch? What is the difference between the view and the subclasses of
+Dataset in PyML?
+
+JB asks: Do container formats (I'm thinking of HDF5) offer features for fast
+retrieval that we would like to expose via this interface?
+
+JB asks: How would you recommend using this sort of dataset in a boosting
+algorithm where points need to be re-weighted.
+
+
+JB asks: Do we want to provide for the possibility of feedback that modifies the
+dataset? For example, curriculum learning might be adaptive in this sense, or
+if we wanted to provide a virtual world for an agent as a dataset then we need
+to provide 'actions' to get the next batch. Could this be done in the current
+API?
+
+
+
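
The set_view / get_view question above can be made concrete with a small sketch. Everything here is an assumption made for illustration: `ToyDataset`, the view names "raw" and "flat", and the behaviour of `get_batch` are hypothetical and are not the Dataset API under discussion. The sketch only contrasts a mutating `set_view` with a `get_view` that returns a new object, and shows how the view could determine the return type of `get_batch`.

```python
# Hypothetical toy class for illustration only; not the planned Dataset API.
class ToyDataset:
    def __init__(self, rows, view="raw"):
        self._rows = list(rows)  # shallow copy of the row list
        self.view = view         # assumed view names: "raw" or "flat"

    def set_view(self, view):
        """Mutating variant: changes this object in place (a side effect)."""
        self.view = view

    def get_view(self, view):
        """Non-mutating variant: returns a new ToyDataset over the same rows."""
        return ToyDataset(self._rows, view)

    def get_batch(self, indices):
        """Return the selected rows; the view decides the shape of the result."""
        batch = [self._rows[i] for i in indices]
        if self.view == "flat":
            # Flatten the selected rows into a single list of values.
            return [x for row in batch for x in row]
        return batch


data = ToyDataset([[1, 2], [3, 4], [5, 6]])

flat = data.get_view("flat")      # new object; `data` is left untouched
print(data.get_batch([0, 2]))     # [[1, 2], [5, 6]]
print(flat.get_batch([0, 2]))     # [1, 2, 5, 6]

data.set_view("flat")             # in-place change, visible to every holder of `data`
print(data.get_batch([0, 2]))     # [1, 2, 5, 6]
```

With the get_view style, code that already holds a reference to `data` keeps seeing the batches it expects. With the set_view style, the change is visible to every holder of the object, which is the kind of side effect the first question asks about.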