# HG changeset patch
# User Razvan Pascanu
# Date 1289578309 18000
# Node ID 18b2ebec6bca1a6c7dd269b250929d5d4ab5fa1d
# Parent  6b9673d72a4153fcffc1f5c4849d52f8c6914aab
Reply to a comment by OD

diff -r 6b9673d72a41 -r 18b2ebec6bca doc/v2_planning/datalearn.txt
--- a/doc/v2_planning/datalearn.txt	Fri Nov 12 10:39:19 2010 -0500
+++ b/doc/v2_planning/datalearn.txt	Fri Nov 12 11:11:49 2010 -0500
@@ -204,7 +204,22 @@
 You wouldn't need to call some graph.replace method: the graphs compiled for
 iterating on 'dataset' and 'new_dataset' would be entirely separate (using
 two different compiled functions, pretty much like #2).
- 
+
+RP answers: Yes, you are right. What I was trying to say is that if you
+have two different datasets to which you want to apply the same
+pre-processing, you can do that in both approaches: ``graph`` represents
+the pre-processing steps in (2), and the final dataset (after
+pre-processing) in (1). So instead of building ``new_graph`` from
+scratch (re-applying all the transforms to the original dataset), you
+can use ``replace``. Alternatively, ``__call__`` (which compiles the
+function if needed) could accept a ``givens`` dictionary that
+substitutes datasets (or more); a sketch follows at the end of this
+patch. I only raised this point because I expected the objection that
+in (2) the pipeline logic is separate from the data, so a transformation
+is easily re-used on different data, while in (1) it is rooted in one
+dataset, and a new dataset would mean re-writing everything.
+
+
 - in approach (1) the initial dataset object (the one that loads the data)
 decides if you will use shared variables and indices to deal with the dataset
 or if you will use ``theano.tensor.matrix`` and not the user( at
@@ -272,7 +287,7 @@
 the same arguments? Maybe a more generic issue is: would there be a way for
 Theano to be more efficient when re-compiling the same function that was
 already compiled in the same program? (note that I am assuming here it is not
-efficient, but I may be wrong). 
+efficient, but I may be wrong).
 
 What About Learners?
 --------------------
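A minimal sketch of the ``replace`` / ``givens`` idea above, assuming the
datasets are stored in Theano shared variables. The names (``dataset``,
``new_dataset``) and the normalization transform are made up for
illustration; ``theano.clone`` and the ``givens`` argument of
``theano.function`` are the actual Theano mechanisms referred to::

    import numpy
    import theano

    # Two datasets held in shared variables (the "shared variables and
    # indices" option discussed above).
    dataset = theano.shared(numpy.random.randn(100, 5), name='dataset')
    new_dataset = theano.shared(numpy.random.randn(200, 5),
                                name='new_dataset')

    # Approach (1): the transformation is rooted directly in `dataset`.
    preprocessed = (dataset - dataset.mean(axis=0)) / dataset.std(axis=0)

    # Re-use the same transformation on `new_dataset` without rebuilding
    # it: either clone the graph, substituting one dataset for the other,
    new_preprocessed = theano.clone(preprocessed,
                                    replace={dataset: new_dataset})

    # or perform the substitution at compilation time through `givens`.
    f = theano.function([], preprocessed)
    f_new = theano.function([], preprocessed,
                            givens={dataset: new_dataset})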