changeset 1092:aab9c261361c

neural_net: Added info about how PLearn was doing it
author Olivier Delalleau <delallea@iro>
date Sun, 12 Sep 2010 15:12:19 -0400
parents 319de699fb67
children a65598681620 75175e2e697d
files doc/v2_planning/neural_net.txt
diffstat 1 files changed, 52 insertions(+), 0 deletions(-) [+]
line wrap: on
line diff
--- a/doc/v2_planning/neural_net.txt	Sun Sep 12 14:14:23 2010 -0400
+++ b/doc/v2_planning/neural_net.txt	Sun Sep 12 15:12:19 2010 -0400
@@ -21,3 +21,55 @@
 pseudo-code for some tasks ( that vary as much as possible) + text describing
 all the missing details.
 
+Link with PLearn
+----------------
+
+OD: This is basically what the OnlineLearningModule framework was doing in
+PLearn (cf. PLearn/plearn_learners/online). The idea was that a module was a
+"box" with so-called "ports" representing inputs / outputs. So for instance
+you could think of an RBM as a module with "visible" and "hidden" ports, but
+also "log_p_visible", "energy", etc. You would use such a module by calling
+an fprop method, giving values for some input ports (not necessarily all of
+them) and asking for the values of some output ports (not necessarily all of
+them). Some ports could be used either as inputs or outputs (e.g. the
+"hidden" port could be used as input to compute P(visible|hidden), or as
+output to compute E[hidden|visible]). Optimization was achieved independently
+within each module, which would be provided a gradient w.r.t. some of its
+ports (considered outputs), and asked to update its internal parameters and
+accordingly compute a gradient w.r.t. its input ports.
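+
+A minimal sketch of what such a ports-based fprop interface could look like,
+in Python (the `RBMModule` / `fprop` names and the toy RBM math below are
+illustrative, not the actual PLearn C++ API):
+
+    import numpy as np
+
+    class RBMModule(object):
+        """Toy RBM-like module with named ports."""
+
+        def __init__(self, n_visible, n_hidden, rng=None):
+            rng = rng or np.random.RandomState(0)
+            self.W = rng.normal(size=(n_hidden, n_visible)) * 0.01
+            self.b_hid = np.zeros(n_hidden)
+
+        def fprop(self, inputs, outputs):
+            """Compute the requested output ports from the given inputs.
+
+            `inputs` maps port names to values; `outputs` lists the port
+            names whose values are requested.
+            """
+            results = {}
+            for port in outputs:
+                if port == 'hidden' and 'visible' in inputs:
+                    # E[hidden|visible]
+                    act = np.dot(self.W, inputs['visible']) + self.b_hid
+                    results[port] = 1.0 / (1.0 + np.exp(-act))
+                else:
+                    raise ValueError('cannot compute port %r from %r'
+                                     % (port, list(inputs)))
+            return results
+
+    rbm = RBMModule(n_visible=4, n_hidden=3)
+    h = rbm.fprop({'visible': np.ones(4)}, outputs=['hidden'])['hidden']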
+
+Although it worked, it had some issues:
+- The biggest problem was that as you added more ports and options to do
+different computations, the fprop method would grow and grow and become very
+difficult to write properly to handle all possible combinations of inputs /
+outputs while remaining efficient. Hopefully this is where Theano can be a
+big help (note: a "lazy if" could be required to handle situations where the
+same port is computed in very different ways depending on what is given as
+input; see the sketch after this list).
+- We had to introduce a notion of 'states': ports whose values had to be
+computed even if the user did not ask for them. The reason was that those
+values were required to perform the optimization (bprop phase) without
+re-doing some computations. Hopefully, again, Theano could take care of this
+(those states were potentially confusing to the user, who had to manipulate
+them without necessarily understanding what they were for).
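+
+A sketch of how a lazy conditional could handle the input-dependent
+computation of a port, assuming Theano's `ifelse` (which evaluates only the
+branch actually taken); the flag-based dispatch is purely illustrative:
+
+    import numpy as np
+    import theano
+    import theano.tensor as T
+    from theano.ifelse import ifelse  # lazy: only the taken branch runs
+
+    n_vis, n_hid = 4, 3
+    W = theano.shared(np.zeros((n_hid, n_vis)), name='W')
+    b = theano.shared(np.zeros(n_hid), name='b')
+
+    visible = T.dvector('visible')
+    hidden_given = T.dvector('hidden_given')
+    have_hidden = T.iscalar('have_hidden')  # 1 if 'hidden' was supplied
+
+    # The 'hidden' port: either the value supplied by the caller, or
+    # E[hidden|visible] computed from the 'visible' port.
+    hidden = ifelse(have_hidden,
+                    hidden_given,
+                    T.nnet.sigmoid(T.dot(W, visible) + b))
+
+    f = theano.function([visible, hidden_given, have_hidden], hidden)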
+
+Besides that, there are at least 3 design decisions that could be made
+differently:
+- How to connect those modules together: in those OnlineLearningModules, each
+module had no idea of who it was connected to. A higher-level entity was
+responsible for grabbing the output of some module and forwarding it to its
+target destination. This is to be contrasted with the design of PLearn
+Variables, where each variable was explicitly constructed given its input
+variables (Theano-like), and would directly ask them to provide data. I am
+not sure what the pros and cons of these two approaches are (the first
+sketch after this list contrasts the two wiring styles).
+- How to perform optimization. The OnlineLearningModule way is nice for
+plugging together pieces that are optimized very differently, because each
+module is responsible for its own optimization. However, this also means it
+is difficult to easily try different global optimizers (again, this is in
+contrast with PLearn variables).
+- One must think about the issue of RNGs for stochastic modules. Here we had
+a single RNG per module. This makes it difficult to easily try different
+seeds for all modules at once. On the other hand, sharing a single RNG is
+not necessarily a good idea because of potentially unwanted side-effects
+(the second sketch after this list shows both options).
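+
+A minimal sketch contrasting the two wiring styles discussed above (all
+class and method names are made up for illustration):
+
+    # Style 1: modules do not know their neighbours; a container wires them.
+    class SequentialNetwork(object):
+        def __init__(self, modules):
+            self.modules = modules
+
+        def fprop(self, value):
+            for m in self.modules:
+                value = m.fprop(value)
+            return value
+
+    # Style 2 (PLearn Variables / Theano-like): each node is built from its
+    # inputs and pulls data from them directly.
+    class Variable(object):
+        def __init__(self, inputs, fn):
+            self.inputs, self.fn = inputs, fn
+
+        def value(self):
+            return self.fn(*[x.value() for x in self.inputs])
+
+    x = Variable([], lambda: 1.0)
+    y = Variable([x], lambda v: v + 1.0)  # y pulls its data from x directly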
+
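+And a sketch of the two RNG policies for stochastic modules (per-module vs.
+shared), again purely illustrative:
+
+    import numpy as np
+
+    class StochasticModule(object):
+        def __init__(self, rng):
+            self.rng = rng
+
+        def sample(self, p):
+            return self.rng.uniform(size=p.shape) < p
+
+    # Per-module RNGs: each module is reproducible in isolation, but trying
+    # a new seed for the whole model means reseeding every module.
+    mods = [StochasticModule(np.random.RandomState(seed))
+            for seed in (1, 2, 3)]
+
+    # Shared RNG: a single seed controls everything, but one module drawing
+    # extra samples shifts the stream seen by all the others (the unwanted
+    # side-effect mentioned above).
+    shared = np.random.RandomState(42)
+    mods = [StochasticModule(shared) for _ in range(3)]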