# HG changeset patch
# User Frederic Bastien
# Date 1297187835 18000
# Node ID 3823dbfff6cfb4d1f93098245d10e68b1ab56b65
# Parent  a36d3a406c59597939efd6620bec6f7d9a70c3d6
add parameter to randomize the valid and test data.

diff -r a36d3a406c59 -r 3823dbfff6cf pylearn/datasets/utlc.py
--- a/pylearn/datasets/utlc.py	Tue Feb 08 12:34:07 2011 -0500
+++ b/pylearn/datasets/utlc.py	Tue Feb 08 12:57:15 2011 -0500
@@ -16,13 +16,15 @@
 import pylearn.io.filetensor as ft
 import config
 
-def load_ndarray_dataset(name, normalize=True, transfer=False, normalize_on_the_fly=False):
+def load_ndarray_dataset(name, normalize=True, transfer=False,
+                         normalize_on_the_fly=False, randomize_valid=False,
+                         randomize_test=False):
     """ Load the train,valid,test data for the dataset `name`
         and return it in ndarray format.
 
     :param normalize: If True, we normalize the train dataset
                       before returning it
-    :param transfer: If True also return the transfer label
+    :param transfer: If True also return the transfer label(currently only available for ule)
     :param normalize_on_the_fly: If True, we return a Theano Variable that will give
                                  as output the normalized value. If the user only
                                  take a subtensor of that variable, Theano optimization
@@ -33,6 +35,12 @@
                                  This is usefull to have the original data in its original
                                  dtype in memory to same memory. Especialy usefull to
                                  be able to use rita and harry with 1G per jobs.
+    :param randomize_valid: Do we randomize the order of the valid set?
+                            We always use the same random order
+                            If False, return in the same order as downloaded on the web
+    :param randomize_test: Do we randomize the order of the test set?
+                           We always use the same random order
+                           If False, return in the same order as downloaded on the web
     """
     assert not (normalize and normalize_on_the_fly), "Can't normalize in 2 way at the same time!"
 
@@ -45,6 +53,14 @@
     train = load_filetensor(trname)
     valid = load_filetensor(vname)
     test = load_filetensor(tename)
+    if randomize_valid:
+        rng = numpy.random.RandomState([1,2,3,4])
+        perm = rng.permutation(valid.shape[0])
+        valid = valid[perm]
+    if randomize_test:
+        rng = numpy.random.RandomState([1,2,3,4])
+        perm = rng.permutation(test.shape[0])
+        test = test[perm]
     if normalize or normalize_on_the_fly:
         if normalize_on_the_fly: