Mercurial > ift6266
diff datasets/defs.py @ 269:4533350d7361
Added a feature to specify the range of P07 training files to use. Useful for pre-training and fine-tuning with different data.
| author | SylvainPL <sylvain.pannetier.lebeuf@umontreal.ca> |
|---|---|
| date | Sat, 20 Mar 2010 10:19:11 -0400 |
| parents | 966272e7f14b |
| children | 22efb4968054 |
--- a/datasets/defs.py Fri Mar 19 11:31:57 2010 -0400
+++ b/datasets/defs.py Sat Mar 20 10:19:11 2010 -0400
@@ -43,8 +43,10 @@
                              valid_lbl = [os.path.join(DATA_PATH,'ocr_valid_labels.ft')],
                              indtype=theano.config.floatX, inscale=255., maxsize=maxsize)
 
-nist_P07 = lambda maxsize=None: FTDataSet(train_data = [os.path.join(DATA_PATH,'data/P07_train'+str(i)+'_data.ft') for i in range(100)],
-                                          train_lbl = [os.path.join(DATA_PATH,'data/P07_train'+str(i)+'_labels.ft') for i in range(100)],
+#There is 2 more arguments here to can choose smaller datasets based on the file number.
+#This is usefull to get different data for pre-training and finetuning
+nist_P07 = lambda maxsize=None, min_file=0, max_file=100: FTDataSet(train_data = [os.path.join(DATA_PATH,'data/P07_train'+str(i)+'_data.ft') for i in range(min_file, max_file)],
+                                          train_lbl = [os.path.join(DATA_PATH,'data/P07_train'+str(i)+'_labels.ft') for i in range(min_file, max_file)],
                                           test_data = [os.path.join(DATA_PATH,'data/P07_test_data.ft')],
                                           test_lbl = [os.path.join(DATA_PATH,'data/P07_test_labels.ft')],
                                           valid_data = [os.path.join(DATA_PATH,'data/P07_valid_data.ft')],
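
To illustrate how the new `min_file` and `max_file` arguments select training files, here is a minimal sketch that reproduces only the path-building logic from the diff. It does not use `FTDataSet` or Theano. The helper name `p07_train_files` and the `DATA_PATH` value are hypothetical, chosen only so the example can run on its own.

```python
import os

# Hypothetical data root; the real DATA_PATH is defined elsewhere in the project.
DATA_PATH = "/data/nist"


def p07_train_files(min_file=0, max_file=100, data_path=DATA_PATH):
    """Build (data_paths, label_paths) for P07 training files.

    Mirrors the list comprehensions in the diff: file indices come from
    range(min_file, max_file), so max_file is exclusive.
    """
    indices = range(min_file, max_file)
    data = [os.path.join(data_path, 'data/P07_train' + str(i) + '_data.ft')
            for i in indices]
    labels = [os.path.join(data_path, 'data/P07_train' + str(i) + '_labels.ft')
              for i in indices]
    return data, labels


# Disjoint halves: files 0..49 for pre-training, files 50..99 for fine-tuning.
pretrain_data, pretrain_lbl = p07_train_files(min_file=0, max_file=50)
finetune_data, finetune_lbl = p07_train_files(min_file=50, max_file=100)
```

With the changed definition itself, the equivalent calls would presumably look like `nist_P07(min_file=0, max_file=50)` and `nist_P07(min_file=50, max_file=100)`. Because the bounds are passed straight to `range`, choosing adjacent half-open intervals keeps the two file sets from overlapping.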