comparison dataset.py @ 269:fdce496c3b56
deprecating __getitem__[fieldname] syntax
author | James Bergstra <bergstrj@iro.umontreal.ca> |
---|---|
date | Wed, 04 Jun 2008 19:04:40 -0400 |
parents | 3f1cd8897fda |
children | fa8abc813bd2 |
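
The effect of this changeset on callers can be sketched with a toy class. This is only an illustration under stated assumptions: `ToyDataSet` is a hypothetical stand-in, not the pylearn `DataSet` class. It stores fields as plain lists and assumes at least one field. It raises `KeyError` for an unknown field name, whereas the `getitem_key` in this changeset falls back to looking the name up in `self.__dict__`.

```python
class ToyDataSet:
    """Hypothetical stand-in for a pylearn DataSet; not the real class."""

    def __init__(self, fields):
        # fields: dict mapping field name -> list of per-example values
        self._fields = fields

    def __len__(self):
        # Assumes at least one field; all fields have the same length.
        return len(next(iter(self._fields.values())))

    def hasFields(self, *fieldnames):
        return all(name in self._fields for name in fieldnames)

    def getitem_key(self, fieldname):
        # Explicit replacement for the removed dataset[fieldname] lookup.
        if self.hasFields(fieldname):
            return list(self._fields[fieldname])
        raise KeyError(fieldname)  # simplification, see the note above

    def __getitem__(self, i):
        if isinstance(i, int):
            # dataset[i]: one example (here a plain dict)
            return {name: values[i] for name, values in self._fields.items()}
        if isinstance(i, slice):
            # dataset[i:j] or dataset[i:j:s]: a sub-dataset
            return ToyDataSet({name: values[i] for name, values in self._fields.items()})
        if isinstance(i, list):
            # dataset[[i1, i2, ...]]: a sub-dataset with the listed examples
            return ToyDataSet(
                {name: [values[k] for k in i] for name, values in self._fields.items()}
            )
        # As in the changeset: any other key, including a field name, is rejected.
        raise TypeError(i, type(i))


data = ToyDataSet({"x": [1, 2, 3], "y": [10, 20, 30]})
first = data[0]                  # integer index still returns one example
column = data.getitem_key("x")   # field access now goes through an explicit method
try:
    data["x"]                    # the old dataset[fieldname] form
    string_key_rejected = False
except TypeError:
    string_key_rejected = True
```

Code that previously relied on `dataset[fieldname]` would need to call an explicit accessor instead; in the changeset itself that accessor is `getitem_key`, which its own docstring describes as a provisional home for the old logic.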
comparison
268:3f1cd8897fda | 269:fdce496c3b56 |
---|---|
107 | 107 |
108 - dataset[i] returns an Example. | 108 - dataset[i] returns an Example. |
109 | 109 |
110 - dataset[[i1,i2,...in]] returns a dataset with examples i1,i2,...in. | 110 - dataset[[i1,i2,...in]] returns a dataset with examples i1,i2,...in. |
111 | 111 |
112 - dataset[fieldname] an iterable over the values of the field fieldname across | |
113 the dataset (the iterable is obtained by default by calling valuesVStack | |
114 over the values for individual examples). | |
115 | |
116 - dataset.<property> returns the value of a property associated with | 112 - dataset.<property> returns the value of a property associated with |
117 the name <property>. The following properties should be supported: | 113 the name <property>. The following properties should be supported: |
118 - 'description': a textual description or name for the dataset | 114 - 'description': a textual description or name for the dataset |
119 - 'fieldtypes': a list of types (one per field) | 115 - 'fieldtypes': a list of types (one per field) |
120 A DataSet may have other attributes that it makes visible to other objects. These are | 116 A DataSet may have other attributes that it makes visible to other objects. These are |
149 | 145 |
150 A DataSet sub-class should always redefine the following methods: | 146 A DataSet sub-class should always redefine the following methods: |
151 - __len__ if it is not a stream | 147 - __len__ if it is not a stream |
152 - fieldNames | 148 - fieldNames |
153 - minibatches_nowrap (called by DataSet.minibatches()) | 149 - minibatches_nowrap (called by DataSet.minibatches()) |
150 For efficiency of implementation, a sub-class might also want to redefine | |
154 - valuesHStack | 151 - valuesHStack |
155 - valuesVStack | 152 - valuesVStack |
156 For efficiency of implementation, a sub-class might also want to redefine | |
157 - hasFields | 153 - hasFields |
158 - __getitem__ may not be feasible with some streams | 154 - __getitem__ may not be feasible with some streams |
159 - __iter__ | 155 - __iter__ |
160 A sub-class should also append attributes to self._attribute_names | 156 A sub-class should also append attributes to self._attribute_names |
161 (the default value returned by attributeNames()). | 157 (the default value returned by attributeNames()). |
410 """ | 406 """ |
411 Return a DataSetFields object associated with this dataset. | 407 Return a DataSetFields object associated with this dataset. |
412 """ | 408 """ |
413 return DataSetFields(self,fieldnames) | 409 return DataSetFields(self,fieldnames) |
414 | 410 |
411 def getitem_key(self, fieldname): | |
412 """A not-so-well thought-out place to put code that used to be in | |
413 getitem. | |
414 """ | |
415 #removing as per discussion June 4. --JSB | |
416 | |
417 i = fieldname | |
418 # else check for a fieldname | |
419 if self.hasFields(i): | |
420 return self.minibatches(fieldnames=[i],minibatch_size=len(self),n_batches=1,offset=0).next()[0] | |
421 # else we are trying to access a property of the dataset | |
422 assert i in self.__dict__ # else it means we are trying to access a non-existing property | |
423 return self.__dict__[i] | |
424 | |
415 def __getitem__(self,i): | 425 def __getitem__(self,i): |
416 """ | 426 """ |
417 dataset[i] returns the (i+1)-th example of the dataset. | 427 dataset[i] returns the (i+1)-th example of the dataset. |
418 dataset[i:j] returns the subdataset with examples i,i+1,...,j-1. | 428 dataset[i:j] returns the subdataset with examples i,i+1,...,j-1. |
419 dataset[i:j:s] returns the subdataset with examples i,i+2,i+4...,j-2. | 429 dataset[i:j:s] returns the subdataset with examples i,i+2,i+4...,j-2. |
458 return MinibatchDataSet( | 468 return MinibatchDataSet( |
459 Example(self.fieldNames(),[ self.valuesVStack(fieldname,field_values) | 469 Example(self.fieldNames(),[ self.valuesVStack(fieldname,field_values) |
460 for fieldname,field_values | 470 for fieldname,field_values |
461 in zip(self.fieldNames(),fields_values)]), | 471 in zip(self.fieldNames(),fields_values)]), |
462 self.valuesVStack,self.valuesHStack) | 472 self.valuesVStack,self.valuesHStack) |
463 # else check for a fieldname | 473 raise TypeError(i, type(i)) |
464 if self.hasFields(i): | |
465 return self.minibatches(fieldnames=[i],minibatch_size=len(self),n_batches=1,offset=0).next()[0] | |
466 # else we are trying to access a property of the dataset | |
467 assert i in self.__dict__ # else it means we are trying to access a non-existing property | |
468 return self.__dict__[i] | |
469 | 474 |
470 def valuesHStack(self,fieldnames,fieldvalues): | 475 def valuesHStack(self,fieldnames,fieldvalues): |
471 """ | 476 """ |
472 Return a value that corresponds to concatenating (horizontally) several field values. | 477 Return a value that corresponds to concatenating (horizontally) several field values. |
473 This can be useful to merge some fields. The implementation of this operation is likely | 478 This can be useful to merge some fields. The implementation of this operation is likely |