# HG changeset patch
# User delallea@opale.iro.umontreal.ca
# Date 1212157156 14400
# Node ID ef70a665aaaf416db6a4cc4f27abd202f050a214
# Parent ddb88a8e9fd2e45decf033e50a4b7e78a33254dc
# Parent 97f35d58672706c2522c2ce61eae65788681a9e0
Hmm... that was committed by Fred I think, I got lost by Mercurial I think

diff -r 97f35d586727 -r ef70a665aaaf dataset.py
--- a/dataset.py	Thu May 29 10:42:29 2008 -0400
+++ b/dataset.py	Fri May 30 10:19:16 2008 -0400
@@ -47,14 +47,14 @@
     columns/attributes are called fields. The field value for a particular example can be an arbitrary
     python object, which depends on the particular dataset.
 
-    We call a DataSet a 'stream' when its length is unbounded (otherwise its __len__ method
+    We call a DataSet a 'stream' when its length is unbounded (in which case its __len__ method
     should return sys.maxint).
 
     A DataSet is a generator of iterators; these iterators can run through the
     examples or the fields in a variety of ways. A DataSet need not necessarily have a finite
     or known length, so this class can be used to interface to a 'stream' which
     feeds on-line learning (however, as noted below, some operations are not
-    feasible or not recommanded on streams).
+    feasible or not recommended on streams).
 
     To iterate over examples, there are several possibilities:
      - for example in dataset:
@@ -81,7 +81,7 @@
      - for field_examples in dataset.fields():
         for example_value in field_examples:
            ...
-      but when the dataset is a stream (unbounded length), it is not recommanded to do
+      but when the dataset is a stream (unbounded length), it is not recommended to do
       such things because the underlying dataset may refuse to access the different fields in
       an unsynchronized ways. Hence the fields() method is illegal for streams, by default.
       The result of fields() is a L{DataSetFields} object, which iterates over fields,
@@ -599,7 +599,7 @@
      * for field_examples in dataset.fields():
         for example_value in field_examples:
            ...
-      but when the dataset is a stream (unbounded length), it is not recommanded to do
+      but when the dataset is a stream (unbounded length), it is not recommended to do
       such things because the underlying dataset may refuse to access the different fields in
       an unsynchronized ways. Hence the fields() method is illegal for streams, by default.
       The result of fields() is a DataSetFields object, which iterates over fields,