annotate datasets/dataset.py @ 163:4b28d7382dbf

Add initial implementation of datasets. For the moment only nist_digits is defined.
author Arnaud Bergeron <abergeron@gmail.com>
date Thu, 25 Feb 2010 18:40:01 -0500
parents
children d6672a7daea5
from dsetiter import DataIterator

class DataSet(object):
    def test(self, batchsize, bufsize=None):
        r"""
        Returns an iterator over the test examples.

        Parameters
          batchsize (int) -- the size of the minibatches, 0 means
                             return the whole set at once.
          bufsize (int, optional) -- the size of the in-memory buffer,
                                     0 to disable.
        """
        return self._return_it(batchsize, bufsize, self._test)

    def train(self, batchsize, bufsize=None):
        r"""
        Returns an iterator over the training examples.

        Parameters
          batchsize (int) -- the size of the minibatches, 0 means
                             return the whole set at once.
          bufsize (int, optional) -- the size of the in-memory buffer,
                                     0 to disable.
        """
        return self._return_it(batchsize, bufsize, self._train)

    def valid(self, batchsize, bufsize=None):
        r"""
        Returns an iterator over the validation examples.

        Parameters
          batchsize (int) -- the size of the minibatches, 0 means
                             return the whole set at once.
          bufsize (int, optional) -- the size of the in-memory buffer,
                                     0 to disable.
        """
        return self._return_it(batchsize, bufsize, self._valid)

    def _return_it(self, batchsize, bufsize, data):
        r"""
        Must return an iterator over the specified dataset (`data`).

        Implement this in subclasses.
        """
        # NotImplemented is a comparison sentinel, not an exception;
        # NotImplementedError is the exception meant for abstract methods.
        raise NotImplementedError
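
A minimal sketch of how a subclass might fill in `_return_it`, assuming the three splits are plain in-memory lists. `ListDataSet` and its constructor arguments are hypothetical and not part of this changeset; the base class is repeated in reduced form only so the example runs on its own, and `bufsize` is simply ignored because nothing is read from disk.

```python
# Reduced stand-in for the DataSet base class, repeated so the sketch is self-contained.
class DataSet(object):
    def test(self, batchsize, bufsize=None):
        return self._return_it(batchsize, bufsize, self._test)

    def train(self, batchsize, bufsize=None):
        return self._return_it(batchsize, bufsize, self._train)

    def valid(self, batchsize, bufsize=None):
        return self._return_it(batchsize, bufsize, self._valid)

    def _return_it(self, batchsize, bufsize, data):
        raise NotImplementedError


class ListDataSet(DataSet):
    """Hypothetical subclass that serves minibatches from in-memory lists."""

    def __init__(self, train, valid, test):
        # The base class expects the splits under these attribute names.
        self._train = list(train)
        self._valid = list(valid)
        self._test = list(test)

    def _return_it(self, batchsize, bufsize, data):
        # bufsize is ignored in this sketch: everything is already in memory.
        if batchsize == 0:
            # Per the docstrings, 0 means "return the whole set at once".
            return iter([data])
        # Otherwise yield consecutive slices; the last one may be shorter.
        return (data[i:i + batchsize] for i in range(0, len(data), batchsize))


ds = ListDataSet(train=range(5), valid=range(5, 7), test=range(7, 10))
train_batches = list(ds.train(2))   # minibatches of size 2
whole_test = list(ds.test(0))       # a single batch holding the whole test set
print(train_batches)
print(whole_test)
```

Keeping the batching logic in `_return_it` means `train`, `valid` and `test` stay identical across subclasses; only the way an iterator is built from the underlying data changes.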