diff doc/v2_planning/dataset.txt @ 1104:5e6d7d9e803a

a comment on the GPU issue for datasets
author Razvan Pascanu <r.pascanu@gmail.com>
date Mon, 13 Sep 2010 20:21:23 -0400
parents 75175e2e697d
children 546bd0ccb0e4
--- a/doc/v2_planning/dataset.txt	Mon Sep 13 16:50:24 2010 -0400
+++ b/doc/v2_planning/dataset.txt	Mon Sep 13 20:21:23 2010 -0400
@@ -324,3 +324,17 @@
 understanding of it, but my feeling is that you need your learner to be
 written in a specific way to achieve this, in which case it may be up to the
 learner to take its input data and store it into a shared variable.
+
+RP comment: Yes, the dataset object alone cannot handle this; the issue sits
+somewhere between the dataset and the learner. In other words, every time you
+change the data you need to recompile your Theano functions, so the learner
+cannot just get data from the dataset, it needs to get a shared variable. The
+learner should also be notified when the dataset is changed, so that it can
+recompile its internal functions. I'm not sure what the best way to do this
+is. My personal feeling is that the dataset should be part of the learner.
+The learner should provide a function use_dataset (or replace_dataset); when
+it is called, all the Theano functions in the learner get recompiled against
+the shared variables that the dataset object provides. This fits very well
+into the framework I have in mind, which is scattered around learner.txt and
+some of my previous emails. I think it shares a lot with James's concepts,
+since it follows the concepts behind Theano quite closely.
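
A minimal sketch of the use_dataset idea described in the comment above,
assuming only Theano's standard shared-variable and function-compilation API.
The Learner class, the quadratic cost, and the use_dataset signature are
hypothetical illustrations, not an agreed interface:

    import numpy as np
    import theano
    import theano.tensor as T

    class Learner(object):
        def __init__(self, n_in, lr=0.01):
            rng = np.random.RandomState(0)
            # Model parameters live in shared variables as usual.
            self.w = theano.shared(rng.randn(n_in), name='w')
            self.lr = lr

        def use_dataset(self, data):
            # Replacing the dataset: wrap the new array in a shared
            # variable and recompile every Theano function that reads
            # it, as the comment above suggests.
            self.data = theano.shared(np.asarray(data), name='data')
            self._compile()

        def _compile(self):
            # The learner's internal functions are compiled against the
            # current shared data, so no input has to be transferred at
            # call time. (Hypothetical quadratic cost, for illustration.)
            cost = T.sum(T.dot(self.data, self.w) ** 2)
            grad = T.grad(cost, self.w)
            self.train_step = theano.function(
                [], cost, updates=[(self.w, self.w - self.lr * grad)])

    learner = Learner(n_in=5)
    learner.use_dataset(np.random.rand(100, 5))  # compiles the functions
    print(learner.train_step())                  # one step on shared data
    learner.use_dataset(np.random.rand(50, 5))   # new dataset -> recompile

Note that if only the values change and the shape stays fixed, calling
set_value on the existing shared variable would avoid recompilation entirely;
the full recompile above is only needed when the dataset is actually replaced.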