--- a/doc/v2_planning/dataset.txt	Fri Sep 10 11:42:48 2010 -0400
+++ b/doc/v2_planning/dataset.txt	Fri Sep 10 12:11:10 2010 -0400
@@ -24,6 +24,16 @@
 - PyML: notion of dataset containers: VectorDataSet, SparseDataSet, KernelData,
   PairDataSet, Aggregate. Ultimately, the learner decides
 - mlpy: very primitive notions of data
+- PyBrain: Datasets are geared towards specific tasks: ClassificationDataSet,
+  SequentialDataSet, ReinforcementDataSet, ... Each class is quite
+  constrained and may have a different interface.
+- MDP: Appears to restrict both the type and the dimensionality of the data
+  being passed around ("Input array data is typically assumed to be
+  two-dimensional and ordered such that observations of the same variable are
+  stored on rows and different variables are stored on columns.")
+- Orange: Data matrices, with a name and a type associated with each column.
+  There appears to be a single base dataset class that contains the data.
+  Data points are lists of values, one per column.
 - (still going through the other ones)

 A few things that our dataset containers should support at a minimum: