comparison doc/v2_planning/neural_net.txt @ 1092:aab9c261361c

neural_net: Added info about how PLearn was doing it
author Olivier Delalleau <delallea@iro>
date Sun, 12 Sep 2010 15:12:19 -0400
parents a80b296eb0df
children 0b666177f725
1091:319de699fb67 1092:aab9c261361c
19 for which there is no one yet, or change the interface to answer our needs. 19 for which there is no one yet, or change the interface to answer our needs.
20 If our description of how these things get together. I would say come up with 20 If our description of how these things get together. I would say come up with
21 pseudo-code for some tasks ( that vary as much as possible) + text describing 21 pseudo-code for some tasks ( that vary as much as possible) + text describing
22 all the missing details. 22 all the missing details.
23 23
24 Link with PLearn
25 ----------------
26
27 OD: This is basically what the OnlineLearningModule framework was doing in
28 PLearn (cf. PLearn/plearn_learners/online). The idea was that a
29 module was a "box" with so-called "ports" representing inputs / outputs. So
30 for instance you could think of an RBM as a module with "visible" and
31 "hidden" ports, but also "log_p_visible", "energy", etc. You would use
32 such a module by calling an fprop method where you would give some values for
33 input ports (not necessarily all of them), and would ask for some output ports
34 (not necessarily all of them). Some ports could be used either as inputs or
35 outputs (e.g. the "hidden" port could be used as input to compute
36 P(visible|hidden), or as output to compute E[hidden|visible]). Optimization
37 was achieved independently within each module, which would be provided a
38 gradient w.r.t. some of its ports (considered outputs), and asked to update
39 its internal parameters and compute accordingly a gradient w.r.t. its input
40 ports.
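
A minimal sketch of the port idea described above, not the actual PLearn code.
The class name ScaleModule, its two ports, the dict-based fprop signature and
the bprop_update method are all assumptions made for illustration. The same
port can be given as an input or requested as an output, and the module updates
its own parameter when it receives a gradient.

```python
class ScaleModule:
    """Hypothetical module with two ports, related by output = w * input."""

    ports = ("input", "output")

    def __init__(self, w=1.0, learning_rate=0.1):
        self.w = w
        self.learning_rate = learning_rate

    def fprop(self, given, requested):
        """Compute the requested ports from whichever port values are given."""
        values = dict(given)
        for port in requested:
            if port in values:
                continue
            if port == "output" and "input" in values:
                values["output"] = self.w * values["input"]
            elif port == "input" and "output" in values:
                # The same port is used here as an output instead of an input.
                values["input"] = values["output"] / self.w
            else:
                raise ValueError(
                    "cannot compute port %r from %r" % (port, sorted(given)))
        return {port: values[port] for port in requested}

    def bprop_update(self, given, output_grad):
        """Update w from the gradient w.r.t. 'output' and return the
        gradient w.r.t. 'input' (computed with the pre-update w)."""
        input_grad = output_grad * self.w
        self.w -= self.learning_rate * output_grad * given["input"]
        return {"input": input_grad}
```

Even with two ports, fprop already needs one branch per input / output
combination, which hints at how quickly this grows with more ports.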
41
42 Although it worked, it had some issues:
43 - The biggest problem was that as you added more ports and options to do
44 different computations, the fprop method would grow and grow and become very
45 difficult to write properly to handle all possible combinations of inputs /
46 outputs, while remaining efficient. Hopefully this is where Theano can be a
47 big help (note: a "lazy if" could be required to handle situations where the
48 same port is computed in very different ways depending on what is given as
49 input).
50 - We had to introduce a notion of 'states' that were ports whose values had to
51 be computed, even if they were not asked for by the user. The reason was that
52 those values were required to perform the optimization (bprop phase) without
53 re-doing some computations. Hopefully again Theano could take care of it
54 (those states were potentially confusing to the user, who had to manipulate
55 them without necessarily understanding what they were for).
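
The 'states' issue can be illustrated with a small sketch. This is an assumed
design for illustration only: SigmoidModule, the "activation" state name and
the method signatures are hypothetical. fprop always returns the state ports,
even when they are not requested, so that bprop can reuse them instead of
recomputing the activation.

```python
import math


class SigmoidModule:
    """Hypothetical module whose 'activation' state must always be computed."""

    ports = ("input", "output")
    states = ("activation",)

    def fprop(self, given, requested):
        activation = 1.0 / (1.0 + math.exp(-given["input"]))
        values = {
            "input": given["input"],
            "output": activation,
            "activation": activation,
        }
        # States are appended to the result even if the user did not ask.
        wanted = list(requested) + [s for s in self.states
                                    if s not in requested]
        return {port: values[port] for port in wanted}

    def bprop(self, values, output_grad):
        # Reuses the stored state: no second call to exp() is needed.
        a = values["activation"]
        return {"input": output_grad * a * (1.0 - a)}
```

The user has to carry the returned dict from fprop to bprop, which is the
kind of bookkeeping that was confusing in practice.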
56
57 Besides that, there are at least 3 design decisions that could be made
58 differently:
59 - How to connect those modules together: in those OnlineLearningModules, each
60 module had no idea of who it was connected to. A higher level entity was
61 responsible for grabbing the output of some module and forwarding it to its
62 target destination. This is to be contrasted with the design of PLearn
63 Variables, where each variable was explicitly constructed given its input
64 variables (Theano-like), and would directly ask them to provide data. I am not
65 sure what the pros and cons of these two approaches are.
66 - How to perform optimization. The OnlineLearningModule way is nice to plug
67 together pieces that are optimized very differently, because each module is
68 responsible for its own optimization. However, this also means it is difficult
69 to easily try different global optimizers (again, this is in contrast with
70 PLearn Variables).
71 - One must think about the issue of RNG for stochastic modules. Here we had
72 one single RNG per module. This makes it difficult to easily try different
73 seeds for everyone. On the other hand, sharing a single RNG is not necessarily
74 a good idea because of potentially unwanted side-effects.
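
One possible middle ground for the RNG question, sketched under assumptions:
keep one RNG per module, but derive every module seed from a single root seed
and the module name. The helper make_module_rngs and the module names are
hypothetical, and the sketch uses Python's standard random.Random rather than
anything from PLearn or Theano.

```python
import random


def make_module_rngs(root_seed, module_names):
    """Return one random.Random per module, each seeded from the root seed
    and the module name, so changing root_seed reseeds every module."""
    return {name: random.Random("%d:%s" % (root_seed, name))
            for name in module_names}


# Usage: each module draws from its own stream, so extra draws in one
# module do not shift the numbers seen by another.
rngs = make_module_rngs(1234, ["rbm", "noise"])
sample = rngs["rbm"].random()
```

Changing the single root seed gives everyone a different seed, while one
module's draws still cannot disturb another module's stream.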
75