--- a/doc/v2_planning/dataset.txt	Thu Oct 21 14:36:36 2010 -0400
+++ b/doc/v2_planning/dataset.txt	Thu Oct 21 16:18:52 2010 -0400
@@ -595,3 +595,21 @@
  which would collect the results for you from sql, and give them to you as
  data object.

+OD replies: Actually this should be doable with (almost) what I wrote above,
+due to the way numpy redefines ==, >, etc. (which btw should break some of my
+assertions above, since I had forgotten about this). If you replace e.g. my
+implementation of ``__eq__`` above with the following:
+
+.. code-block:: python
+
+    def __eq__(self, other):
+        return other == self()
+
+Here, `self` is a dataset that represents some numpy vector data. Then whether
+`other` is another dataset or a numpy vector or some scalar, this will return
+a numpy boolean vector (the result of the comparison made by numpy). We may
+support boolean vectors in advanced indexing, so you could write
+``d[d.some_field == 5]``
+and obtain the subset of `d` whose samples have `some_field` set to 5.
+The same could be done with ``__lt__``, ``__le__``, etc.
+
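
The idea in OD's reply can be tried out with the minimal sketch below. It is not pylearn code: `VectorDataset` and `TableDataset` are hypothetical stand-ins, and the sketch assumes that calling a dataset returns its underlying numpy array, as `self()` does in the snippet above. The `__lt__`/`__gt__` pair is one possible way to extend the same trick to ordering comparisons.

```python
import numpy as np


class VectorDataset(object):
    """Hypothetical dataset wrapping a single numpy vector."""

    def __init__(self, values):
        self._values = np.asarray(values)

    def __call__(self):
        # Assumption: calling a dataset returns its underlying numpy data.
        return self._values

    def __eq__(self, other):
        # Same trick as in the reply: let numpy perform the comparison,
        # which yields an elementwise boolean vector.
        return other == self()

    def __lt__(self, other):
        # "self < other" is the same as "other > self".
        return other > self()

    def __gt__(self, other):
        return other < self()


class TableDataset(object):
    """Hypothetical dataset with named fields, indexable by a boolean mask."""

    def __init__(self, **fields):
        self._fields = dict((k, np.asarray(v)) for k, v in fields.items())

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        fields = self.__dict__.get("_fields", {})
        if name in fields:
            return VectorDataset(fields[name])
        raise AttributeError(name)

    def __getitem__(self, mask):
        mask = np.asarray(mask)
        if mask.dtype != bool:
            raise TypeError("this sketch only supports boolean masks")
        # Keep the samples where the mask is True, in every field.
        selected = dict((k, v[mask]) for k, v in self._fields.items())
        return TableDataset(**selected)


d = TableDataset(some_field=[5, 3, 5, 7], label=[0, 1, 2, 3])
mask = d.some_field == 5   # numpy boolean vector
subset = d[mask]           # samples whose some_field equals 5
smaller = d[d.some_field < 5]
```

The comparison works whether the right-hand side is a scalar, a numpy vector, or another `VectorDataset`, because in each case the final comparison is carried out by numpy on plain arrays.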