# HG changeset patch
# User Frederic Bastien
# Date 1209487031 14400
# Node ID c5b07e87b0cbec055c7952689a77eb833e91a234
# Parent a5c70dc42972de2691a649c1e370eb9bf115dc25
comments modif made by Yoshua

diff -r a5c70dc42972 -r c5b07e87b0cb dataset.py
--- a/dataset.py Tue Apr 29 11:25:36 2008 -0400
+++ b/dataset.py Tue Apr 29 12:37:11 2008 -0400
@@ -33,6 +33,8 @@
      * for minibatch in dataset.minibatches([field1, field2, ...],minibatch_size=N):
      * for mini1,mini2,mini3 in dataset.minibatches([field1, field2, ...],minibatch_size=N):
      * for example in dataset:
+          print example['x']
+     * for x,y,z in dataset:
     Each of these is documented below. All of these iterators are expected to provide,
     in addition to the usual 'next()' method, a 'next_index()' method which returns
     a non-negative integer pointing to the position of the next
@@ -43,7 +45,8 @@
     dataset length.
 
     To iterate over fields, one can do
-     * for fields in dataset.fields()
+     * for field in dataset.fields():
+         for field_value in field: # iterate over the values associated to that field for all the dataset examples
      * for fields in dataset(field1,field2,...).fields() to select a subset of fields
      * for fields in dataset.fields(field1,field2,...) to select a subset of fields
     and each of these fields is iterable over the examples:
@@ -63,7 +66,8 @@
 
     Note: The content of a field can be of any type. Field values can also be 'missing'
     (e.g. to handle semi-supervised learning), and in the case of numeric (numpy array)
-    fields (i.e. an ArrayFieldsDataSet), NaN plays the role of a missing value.
+    fields (i.e. an ArrayFieldsDataSet), NaN plays the role of a missing value.
+    What about non-numeric values? None.
 
     Dataset elements can be indexed and sub-datasets (with a subset
     of examples) can be extracted. These operations are not supported
@@ -101,7 +105,7 @@
     works if they all have the same fields.
 
     According to the same logic, and viewing a DataSetFields object associated to
-    a DataSet as a kind of transpose of it, fields1 + fields2 concatenates fields of
+    a DataSet as a kind of transpose of it, fields1 & fields2 concatenates fields of
     a DataSetFields fields1 and fields2, and fields1 | fields2 concatenates their examples.
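
The last hunk changes the field-concatenation operator from + to &, while | still concatenates examples. As a rough illustration only, the sketch below uses a hypothetical ToyFields class (it is not the DataSetFields class from dataset.py) that stores each field as a list of per-example values. The checks for duplicate field names and for mismatched fields are assumptions of this sketch, not behavior confirmed by the patch.

```python
class ToyFields(object):
    """Hypothetical stand-in for DataSetFields: field name -> list of values."""

    def __init__(self, **columns):
        # Every field must hold one value per example.
        lengths = set(len(values) for values in columns.values())
        if len(lengths) > 1:
            raise ValueError("all fields must have the same number of examples")
        self.columns = dict((name, list(values))
                            for name, values in columns.items())

    def __and__(self, other):
        # fields1 & fields2: same examples, fields placed side by side.
        # Rejecting duplicate names is an assumption of this sketch.
        overlap = set(self.columns) & set(other.columns)
        if overlap:
            raise ValueError("duplicate field names: %s" % sorted(overlap))
        merged = dict(self.columns)
        merged.update(other.columns)
        return ToyFields(**merged)

    def __or__(self, other):
        # fields1 | fields2: same fields, examples appended.
        # Requiring identical field names is an assumption of this sketch.
        if set(self.columns) != set(other.columns):
            raise ValueError("both operands must have the same fields")
        return ToyFields(**dict(
            (name, self.columns[name] + other.columns[name])
            for name in self.columns))


inputs = ToyFields(x=[1, 2], y=[3, 4])
targets = ToyFields(z=[5, 6])
more_inputs = ToyFields(x=[7], y=[8])

wider = inputs & targets       # fields x, y, z over the same 2 examples
longer = inputs | more_inputs  # fields x, y over 3 examples
```

Using & and | rather than + keeps the two directions of concatenation visually distinct, which is presumably the motivation for the change.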