Mercurial > pylearn
diff dataset.py @ 46:c5b07e87b0cb
comments modif made by Yoshua
author | Frederic Bastien <bastienf@iro.umontreal.ca> |
---|---|
date | Tue, 29 Apr 2008 12:37:11 -0400 |
parents | a5c70dc42972 |
children | b6730f9a336d ea7d8bc38b34 |
--- a/dataset.py Tue Apr 29 11:25:36 2008 -0400
+++ b/dataset.py Tue Apr 29 12:37:11 2008 -0400
@@ -33,6 +33,8 @@
 * for minibatch in dataset.minibatches([field1, field2, ...],minibatch_size=N):
 * for mini1,mini2,mini3 in dataset.minibatches([field1, field2, ...],minibatch_size=N):
 * for example in dataset:
+ print example['x']
+ * for x,y,z in dataset:
 Each of these is documented below. All of these iterators are expected to provide, in addition to the usual 'next()' method, a 'next_index()' method which returns a non-negative integer pointing to the position of the next
@@ -43,7 +45,8 @@
 dataset length.
 To iterate over fields, one can do
- * for fields in dataset.fields()
+ * for field in dataset.fields():
+ for field_value in field: # iterate over the values associated to that field for all the dataset examples
 * for fields in dataset(field1,field2,...).fields() to select a subset of fields
 * for fields in dataset.fields(field1,field2,...) to select a subset of fields
 and each of these fields is iterable over the examples:
@@ -63,7 +66,8 @@
 Note: The content of a field can be of any type. Field values can also be 'missing' (e.g. to handle semi-supervised learning), and in the case of numeric (numpy array)
- fields (i.e. an ArrayFieldsDataSet), NaN plays the role of a missing value.
+ fields (i.e. an ArrayFieldsDataSet), NaN plays the role of a missing value.
+ What about non-numeric values? None.
 Dataset elements can be indexed and sub-datasets (with a subset of examples) can be extracted. These operations are not supported
@@ -101,7 +105,7 @@
 works if they all have the same fields.
 According to the same logic, and viewing a DataSetFields object associated to
- a DataSet as a kind of transpose of it, fields1 + fields2 concatenates fields of
+ a DataSet as a kind of transpose of it, fields1 & fields2 concatenates fields of
 a DataSetFields fields1 and fields2, and fields1 | fields2 concatenates their examples.
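The iteration idioms touched by this change can be illustrated with a small, self-contained sketch. It is not the pylearn implementation: `ToyDataSet`, `Example`, and `is_missing` are hypothetical names introduced here, the data is made up, and the code uses Python 3 syntax (the docstring itself uses the Python 2 `print` statement). It only assumes what the docstring states: an example can be indexed by field name or unpacked into its values, `fields()` yields one iterable per field, and a missing value is NaN for numeric fields and None otherwise.

```python
import math


class Example:
    """One dataset row: indexable by field name, unpackable like a tuple."""

    def __init__(self, fieldnames, values):
        self._names = list(fieldnames)
        self._values = list(values)

    def __getitem__(self, name):
        # Look a value up by its field name, e.g. example["x"].
        return self._values[self._names.index(name)]

    def __iter__(self):
        # Allows `for x, y, z in dataset` style unpacking.
        return iter(self._values)


class ToyDataSet:
    """Hypothetical in-memory dataset; NOT the pylearn DataSet class."""

    def __init__(self, **columns):
        lengths = {len(col) for col in columns.values()}
        if len(lengths) > 1:
            raise ValueError("all fields must have the same number of examples")
        self._columns = {name: list(col) for name, col in columns.items()}

    def fieldnames(self):
        return list(self._columns)

    def __len__(self):
        return len(next(iter(self._columns.values()), []))

    def __iter__(self):
        names = self.fieldnames()
        for row in zip(*(self._columns[n] for n in names)):
            yield Example(names, row)

    def fields(self, *names):
        """Yield each selected field as the list of its values over all examples."""
        for name in names or self.fieldnames():
            yield self._columns[name]


def is_missing(value):
    """NaN marks a missing numeric value, None a missing non-numeric one."""
    if value is None:
        return True
    return isinstance(value, float) and math.isnan(value)


# Made-up data: the second example has a missing x (NaN) and a missing y (None).
dataset = ToyDataSet(
    x=[1.0, float("nan"), 3.0],
    y=["cat", None, "dog"],
    z=[0, 1, 0],
)

for example in dataset:          # dict-like access to one field
    print(example["x"])

for x, y, z in dataset:          # unpack all field values of each example
    print(x, y, z, is_missing(x), is_missing(y))

for field in dataset.fields():   # iterate over fields, then over their values
    for field_value in field:
        print(field_value)
```

Returning a small row object rather than a plain tuple or dict is what lets both `example['x']` and `for x,y,z in dataset` work on the same iterator; a real implementation would likely avoid the linear name lookup used here.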