ift6266: datasets/defs.py annotate

annotate datasets/defs.py @ 189:0d0677773533

Fix bug where there would be a bunch of 0-length batches at the end under certain circumstances.

author	Arnaud Bergeron <abergeron@gmail.com>
date	Mon, 01 Mar 2010 17:06:49 -0500
parents	f0f47b045cbf
children	476da2ba6a12
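
The changeset description above mentions 0-length batches appearing at the end of iteration. The fix itself lives in the FTDataSet code, which is not shown on this page, so the following is only a minimal sketch of the general idea, assuming an in-memory sequence. The helper name `iter_minibatches` is hypothetical and is not part of this repository.

```python
def iter_minibatches(items, batch_size):
    """Yield consecutive slices of `items`, never an empty one.

    Hypothetical helper for illustration only; not the FTDataSet code.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # range() stops before len(items), so every start index is valid and
    # each slice holds at least one element. The last slice may be short.
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


if __name__ == "__main__":
    sizes = [len(b) for b in iter_minibatches(list(range(10)), 4)]
    print(sizes)  # [4, 4, 2]
```

Driving the loop from the data length, rather than from a precomputed batch count, is one way to guarantee that iteration ends as soon as the data is exhausted.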

rev	line source
181 f0f47b045cbf Remove a stray cast in the FTDataSet code and export the ocr dataset. Arnaud Bergeron <abergeron@gmail.com> parents: 180 diff changeset	1 __all__ = ['nist_digits', 'nist_lower', 'nist_upper', 'nist_all', 'ocr']
163 4b28d7382dbf Add initial implementation of datasets. Arnaud Bergeron <abergeron@gmail.com> parents: diff changeset	2
4b28d7382dbf Add initial implementation of datasets. Arnaud Bergeron <abergeron@gmail.com> parents: diff changeset	3 from ftfile import FTDataSet
180 76bc047df5ee Add dtype conversion and rescaling to the read path. Arnaud Bergeron <abergeron@gmail.com> parents: 175 diff changeset	4 import theano
163 4b28d7382dbf Add initial implementation of datasets. Arnaud Bergeron <abergeron@gmail.com> parents: diff changeset	5
175 224321bf043a Define the ocr dataset and use the existing split for nist. Arnaud Bergeron <abergeron@gmail.com> parents: 164 diff changeset	6 NIST_PATH = '/data/lisa/data/nist/by_class/'
224321bf043a Define the ocr dataset and use the existing split for nist. Arnaud Bergeron <abergeron@gmail.com> parents: 164 diff changeset	7 DATA_PATH = '/data/lisa/data/ift6266h10/'
224321bf043a Define the ocr dataset and use the existing split for nist. Arnaud Bergeron <abergeron@gmail.com> parents: 164 diff changeset	8
224321bf043a Define the ocr dataset and use the existing split for nist. Arnaud Bergeron <abergeron@gmail.com> parents: 164 diff changeset	9 nist_digits = FTDataSet(train_data = [NIST_PATH+'digits/digits_train_data.ft'],
224321bf043a Define the ocr dataset and use the existing split for nist. Arnaud Bergeron <abergeron@gmail.com> parents: 164 diff changeset	10 train_lbl = [NIST_PATH+'digits/digits_train_labels.ft'],
224321bf043a Define the ocr dataset and use the existing split for nist. Arnaud Bergeron <abergeron@gmail.com> parents: 164 diff changeset	11 test_data = [NIST_PATH+'digits/digits_test_data.ft'],
180 76bc047df5ee Add dtype conversion and rescaling to the read path. Arnaud Bergeron <abergeron@gmail.com> parents: 175 diff changeset	12 test_lbl = [NIST_PATH+'digits/digits_test_labels.ft'],
76bc047df5ee Add dtype conversion and rescaling to the read path. Arnaud Bergeron <abergeron@gmail.com> parents: 175 diff changeset	13 indtype=theano.config.floatX, inscale=255.)
175 224321bf043a Define the ocr dataset and use the existing split for nist. Arnaud Bergeron <abergeron@gmail.com> parents: 164 diff changeset	14 nist_lower = FTDataSet(train_data = [NIST_PATH+'lower/lower_train_data.ft'],
224321bf043a Define the ocr dataset and use the existing split for nist. Arnaud Bergeron <abergeron@gmail.com> parents: 164 diff changeset	15 train_lbl = [NIST_PATH+'lower/lower_train_labels.ft'],
224321bf043a Define the ocr dataset and use the existing split for nist. Arnaud Bergeron <abergeron@gmail.com> parents: 164 diff changeset	16 test_data = [NIST_PATH+'lower/lower_test_data.ft'],
180 76bc047df5ee Add dtype conversion and rescaling to the read path. Arnaud Bergeron <abergeron@gmail.com> parents: 175 diff changeset	17 test_lbl = [NIST_PATH+'lower/lower_test_labels.ft'],
76bc047df5ee Add dtype conversion and rescaling to the read path. Arnaud Bergeron <abergeron@gmail.com> parents: 175 diff changeset	18 indtype=theano.config.floatX, inscale=255.)
175 224321bf043a Define the ocr dataset and use the existing split for nist. Arnaud Bergeron <abergeron@gmail.com> parents: 164 diff changeset	19 nist_upper = FTDataSet(train_data = [NIST_PATH+'upper/upper_train_data.ft'],
224321bf043a Define the ocr dataset and use the existing split for nist. Arnaud Bergeron <abergeron@gmail.com> parents: 164 diff changeset	20 train_lbl = [NIST_PATH+'upper/upper_train_labels.ft'],
224321bf043a Define the ocr dataset and use the existing split for nist. Arnaud Bergeron <abergeron@gmail.com> parents: 164 diff changeset	21 test_data = [NIST_PATH+'upper/upper_test_data.ft'],
180 76bc047df5ee Add dtype conversion and rescaling to the read path. Arnaud Bergeron <abergeron@gmail.com> parents: 175 diff changeset	22 test_lbl = [NIST_PATH+'upper/upper_test_labels.ft'],
76bc047df5ee Add dtype conversion and rescaling to the read path. Arnaud Bergeron <abergeron@gmail.com> parents: 175 diff changeset	23 indtype=theano.config.floatX, inscale=255.)
163 4b28d7382dbf Add initial implementation of datasets. Arnaud Bergeron <abergeron@gmail.com> parents: diff changeset	24
175 224321bf043a Define the ocr dataset and use the existing split for nist. Arnaud Bergeron <abergeron@gmail.com> parents: 164 diff changeset	25 nist_all = FTDataSet(train_data = [DATA_PATH+'train_data.ft'],
224321bf043a Define the ocr dataset and use the existing split for nist. Arnaud Bergeron <abergeron@gmail.com> parents: 164 diff changeset	26 train_lbl = [DATA_PATH+'train_labels.ft'],
224321bf043a Define the ocr dataset and use the existing split for nist. Arnaud Bergeron <abergeron@gmail.com> parents: 164 diff changeset	27 test_data = [DATA_PATH+'test_data.ft'],
224321bf043a Define the ocr dataset and use the existing split for nist. Arnaud Bergeron <abergeron@gmail.com> parents: 164 diff changeset	28 test_lbl = [DATA_PATH+'test_labels.ft'],
224321bf043a Define the ocr dataset and use the existing split for nist. Arnaud Bergeron <abergeron@gmail.com> parents: 164 diff changeset	29 valid_data = [DATA_PATH+'valid_data.ft'],
180 76bc047df5ee Add dtype conversion and rescaling to the read path. Arnaud Bergeron <abergeron@gmail.com> parents: 175 diff changeset	30 valid_lbl = [DATA_PATH+'valid_labels.ft'],
76bc047df5ee Add dtype conversion and rescaling to the read path. Arnaud Bergeron <abergeron@gmail.com> parents: 175 diff changeset	31 indtype=theano.config.floatX, inscale=255.)
175 224321bf043a Define the ocr dataset and use the existing split for nist. Arnaud Bergeron <abergeron@gmail.com> parents: 164 diff changeset	32
224321bf043a Define the ocr dataset and use the existing split for nist. Arnaud Bergeron <abergeron@gmail.com> parents: 164 diff changeset	33 ocr = FTDataSet(train_data = [DATA_PATH+'ocr_train_data.ft'],
224321bf043a Define the ocr dataset and use the existing split for nist. Arnaud Bergeron <abergeron@gmail.com> parents: 164 diff changeset	34 train_lbl = [DATA_PATH+'ocr_train_labels.ft'],
224321bf043a Define the ocr dataset and use the existing split for nist. Arnaud Bergeron <abergeron@gmail.com> parents: 164 diff changeset	35 test_data = [DATA_PATH+'ocr_test_data.ft'],
224321bf043a Define the ocr dataset and use the existing split for nist. Arnaud Bergeron <abergeron@gmail.com> parents: 164 diff changeset	36 test_lbl = [DATA_PATH+'ocr_test_labels.ft'],
224321bf043a Define the ocr dataset and use the existing split for nist. Arnaud Bergeron <abergeron@gmail.com> parents: 164 diff changeset	37 valid_data = [DATA_PATH+'ocr_valid_data.ft'],
224321bf043a Define the ocr dataset and use the existing split for nist. Arnaud Bergeron <abergeron@gmail.com> parents: 164 diff changeset	38 valid_lbl = [DATA_PATH+'ocr_valid_labels.ft'])
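
The NIST definitions above pass `indtype=theano.config.floatX` and `inscale=255.`, which changeset 180 describes as dtype conversion and rescaling on the read path. How FTDataSet applies them is not shown here. The sketch below assumes that raw values are cast to floating point and divided by `inscale`, which would map 8-bit pixel values into the range [0, 1]. The function `rescale` is a hypothetical stand-in, written in plain Python rather than with Theano or NumPy.

```python
def rescale(raw_values, inscale=255.0):
    """Cast raw values to float and divide by `inscale`.

    Assumption: this mirrors what `inscale` does in FTDataSet.
    Hypothetical helper; the real read path is not shown on this page.
    """
    if inscale == 0:
        raise ValueError("inscale must be non-zero")
    return [float(v) / inscale for v in raw_values]


if __name__ == "__main__":
    pixels = [0, 51, 255]  # example 8-bit pixel intensities
    print(rescale(pixels))
```

Note that the `ocr` definition above passes neither parameter, so under this assumption its inputs would keep whatever dtype and scale the files store.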
