pylearn: doc/v2_planning/neural

annotate doc/v2_planning/neural_net.txt @ 1215:4754661ad6ab

reply to plugin_JB_comments_IG

author	James Bergstra <bergstrj@iro.umontreal.ca>
date	Wed, 22 Sep 2010 11:58:04 -0400
parents	0e12ea6ba661
children

rev	line source
1088 e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	1 Neural Net committee
e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	2 ====================
e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	3
e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	4 Members:
e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	5 - Razvan Pascanu
e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	6 - James Bergstra
e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	7 - Xavier Glorot
1099 0b666177f725 added myself to committe gdesjardins parents: 1092 diff changeset	8 - Guillaume Desjardins
1088 e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	9
e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	10 (Add your name here if you want)
e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	11
e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	12
e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	13 Objective ( Razvan)
1189 0e12ea6ba661 fix many rst syntax error warning. Frederic Bastien <nouiz@nouiz.org> parents: 1099 diff changeset	14 -------------------
1088 e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	15
1090 a80b296eb0df I removed big picture from the description of the neural network committee Razvan Pascanu <r.pascanu@gmail.com> parents: 1088 diff changeset	16 Come up with a description of how to write learners ( how to combine
a80b296eb0df I removed big picture from the description of the neural network committee Razvan Pascanu <r.pascanu@gmail.com> parents: 1088 diff changeset	17 optimizer, structure, error measure, how to talk to datasets, tasks ( if there
a80b296eb0df I removed big picture from the description of the neural network committee Razvan Pascanu <r.pascanu@gmail.com> parents: 1088 diff changeset	18 is anything like a dataset object in your view) and so on).
a80b296eb0df I removed big picture from the description of the neural network committee Razvan Pascanu <r.pascanu@gmail.com> parents: 1088 diff changeset	19 o The way I see it personally, we should pick "random" interfaces for any component
a80b296eb0df I removed big picture from the description of the neural network committee Razvan Pascanu <r.pascanu@gmail.com> parents: 1088 diff changeset	20 for which there is none yet, or change the interface to answer our needs.
a80b296eb0df I removed big picture from the description of the neural network committee Razvan Pascanu <r.pascanu@gmail.com> parents: 1088 diff changeset	21 For our description of how these things fit together, I would say come up with
a80b296eb0df I removed big picture from the description of the neural network committee Razvan Pascanu <r.pascanu@gmail.com> parents: 1088 diff changeset	22 pseudo-code for some tasks (that vary as much as possible) + text describing
a80b296eb0df I removed big picture from the description of the neural network committee Razvan Pascanu <r.pascanu@gmail.com> parents: 1088 diff changeset	23 all the missing details.
1088 e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	24
1092 aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	25 Link with PLearn
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	26 ----------------
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	27
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	28 OD: This is basically what the OnlineLearningModule framework was doing in
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	29 PLearn (cf. PLearn/plearn_learners/online). The idea was that a
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	30 module was a "box" with so-called "ports" representing inputs / outputs. So
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	31 for instance you could think of an RBM as a module with "visible" and
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	32 "hidden" ports, but also "log_p_visible", "energy", etc. You would use
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	33 such a module by calling an fprop method where you would give some values for
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	34 input ports (not necessarily all of them), and would ask for some output ports
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	35 (not necessarily all of them). Some ports could be used either as inputs or
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	36 outputs (e.g. the "hidden" port could be used as input to compute
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	37 P(visible\|hidden), or as output to compute E[hidden\|visible]). Optimization
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	38 was achieved independently within each module, which would be provided a
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	39 gradient w.r.t. some of its ports (considered outputs), and asked to update
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	40 its internal parameters and accordingly compute a gradient w.r.t. its input
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	41 ports.
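
To make the port idea above concrete, here is a minimal sketch in plain Python. It is not PLearn's actual interface: the class name ToyRBMModule, the fprop(inputs, outputs) signature and the dict-based port passing are assumptions made for illustration, and only the "visible" and "hidden" ports are handled.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class ToyRBMModule:
    """Hypothetical port-based module; not PLearn's actual interface."""

    def __init__(self, weights, visible_bias, hidden_bias):
        self.weights = weights            # weights[i][j]: visible i <-> hidden j
        self.visible_bias = visible_bias
        self.hidden_bias = hidden_bias

    def fprop(self, inputs, outputs):
        """Compute the requested output ports from the provided input ports."""
        results = {}
        for port in outputs:
            if port == "hidden" and "visible" in inputs:
                v = inputs["visible"]
                # "hidden" used as an output: E[hidden | visible]
                results[port] = [
                    sigmoid(self.hidden_bias[j]
                            + sum(v[i] * self.weights[i][j] for i in range(len(v))))
                    for j in range(len(self.hidden_bias))
                ]
            elif port == "visible" and "hidden" in inputs:
                h = inputs["hidden"]
                # "hidden" used as an input: P(visible = 1 | hidden)
                results[port] = [
                    sigmoid(self.visible_bias[i]
                            + sum(h[j] * self.weights[i][j] for j in range(len(h))))
                    for i in range(len(self.visible_bias))
                ]
            else:
                raise ValueError("cannot compute port %r from %r"
                                 % (port, sorted(inputs)))
        return results


rbm = ToyRBMModule(weights=[[0.0, 1.0], [0.0, -1.0]],
                   visible_bias=[0.0, 0.0], hidden_bias=[0.0, 0.0])
hidden = rbm.fprop({"visible": [1.0, 1.0]}, ["hidden"])["hidden"]
visible = rbm.fprop({"hidden": [0.0, 0.0]}, ["visible"])["visible"]
```

Even with two ports, fprop already needs one branch per (given inputs, requested output) combination, which is the growth problem described in the issues that follow.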
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	42
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	43 Although it worked, it had some issues:
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	44 - The biggest problem was that as you added more ports and options to do
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	45 different computations, the fprop method would grow and grow and become very
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	46 difficult to write properly to handle all possible combinations of inputs /
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	47 outputs, while remaining efficient. Hopefully this is where Theano can be a
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	48 big help (note: a "lazy if" could be required to handle situations where the
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	49 same port is computed in very different ways depending on what is given as
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	50 input).
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	51 - We had to introduce a notion of 'states' that were ports whose values had to
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	52 be computed, even if they were not asked for by the user. The reason was that
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	53 those values were required to perform the optimization (bprop phase) without
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	54 re-doing some computations. Hopefully again Theano could take care of it
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	55 (those states were potentially confusing to the user, who had to manipulate
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	56 them without necessarily understanding what they were for).
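
The role of 'states' can be sketched as follows, under assumptions: TanhLayer, its state dict and the bprop_update method are hypothetical names, not the OnlineLearningModule API. The forward pass stores the values that the backward pass needs, so that the update does not have to recompute them.

```python
import math


class TanhLayer:
    """Hypothetical one-unit module; names are illustrative, not PLearn's API."""

    def __init__(self, weight, learning_rate):
        self.weight = weight
        self.learning_rate = learning_rate
        self.state = {}  # values kept from fprop for use in bprop_update

    def fprop(self, x):
        y = math.tanh(self.weight * x)
        # Cache what the backward pass needs so it does not recompute it.
        self.state = {"input": x, "output": y}
        return y

    def bprop_update(self, grad_output):
        """Update the weight and return the gradient w.r.t. the input port."""
        x, y = self.state["input"], self.state["output"]
        grad_pre = grad_output * (1.0 - y * y)   # d tanh(a)/da = 1 - tanh(a)^2
        grad_input = grad_pre * self.weight      # uses the weight before the update
        self.weight -= self.learning_rate * grad_pre * x
        return grad_input


layer = TanhLayer(weight=0.0, learning_rate=0.1)
y = layer.fprop(2.0)
grad_x = layer.bprop_update(1.0)
```

Here the state is hidden inside the module. In the design described above it was exposed as ports, which is where the confusion for the user came from.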
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	57
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	58 Besides that, there are at least 3 design decisions that could be made
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	59 differently:
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	60 - How to connect those modules together: in those OnlineLearningModules, each
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	61 module had no idea of who it was connected to. A higher level entity was
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	62 responsible for grabbing the output of some module and forwarding it to its
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	63 target destination. This is to be contrasted with the design of PLearn
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	64 Variables, where each variable was explicitly constructed given its input
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	65 variables (Theano-like), and would directly ask them to provide data. I am not
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	66 sure what the pros and cons of these two approaches are.
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	67 - How to perform optimization. The OnlineLearningModule way is nice to plug
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	68 together pieces that are optimized very differently, because each module is
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	69 responsible for its own optimization. However, this also means it is difficult
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	70 to easily try different global optimizers (again, this is in contrast with
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	71 PLearn variables).
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	72 - One must think about the issue of RNG for stochastic modules. Here we had
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	73 one single RNG per module. This makes it difficult to easily try different
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	74 seeds for everyone. On the other hand, sharing a single RNG is not necessarily
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	75 a good idea because of potentially unwanted side-effects.
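
One possible middle ground for the RNG question in the last item, sketched with Python's standard random module: keep one RNG per module, but derive every module's seed from a single master seed. The helper make_module_rngs and the module names are hypothetical; this is not something the original framework is known to have done.

```python
import random


def make_module_rngs(master_seed, module_names):
    """Derive one RNG per module from a single master seed (illustrative helper)."""
    seeder = random.Random(master_seed)
    # Seeds are drawn in the order of module_names, so the order matters.
    return {name: random.Random(seeder.getrandbits(64)) for name in module_names}


rngs_a = make_module_rngs(123, ["rbm1", "rbm2"])
rngs_b = make_module_rngs(123, ["rbm1", "rbm2"])

# Extra draws from one module's RNG do not disturb the other module's stream.
for _ in range(10):
    rngs_a["rbm1"].random()
same_stream = rngs_a["rbm2"].random() == rngs_b["rbm2"].random()
```

Changing the master seed reseeds all modules at once, while each module still owns its stream. One caveat of this sketch: inserting a module in the middle of module_names shifts the seeds of the modules after it.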
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	76
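
The first design decision above (modules that do not know their neighbours, wired by a higher-level entity) might look like the following sketch. Scale, Offset, Network and the connection tuples are hypothetical names chosen for illustration, and modules are assumed to be listed in execution order.

```python
class Scale:
    """Hypothetical module: knows nothing about what it is connected to."""

    def __init__(self, factor):
        self.factor = factor

    def fprop(self, inputs):
        return {"output": inputs["input"] * self.factor}


class Offset:
    """Second hypothetical module, equally unaware of its neighbours."""

    def __init__(self, amount):
        self.amount = amount

    def fprop(self, inputs):
        return {"output": inputs["input"] + self.amount}


class Network:
    """Higher-level entity that forwards port values between modules."""

    def __init__(self, modules, connections):
        self.modules = modules            # name -> module, in execution order
        self.connections = connections    # (src_module, src_port, dst_module, dst_port)

    def fprop(self, external_inputs):
        pending = {name: dict(ports) for name, ports in external_inputs.items()}
        results = {}
        for name, module in self.modules.items():
            results[name] = module.fprop(pending.get(name, {}))
            # Forward this module's outputs to whatever is connected to it.
            for src, src_port, dst, dst_port in self.connections:
                if src == name:
                    pending.setdefault(dst, {})[dst_port] = results[src][src_port]
        return results


net = Network(
    modules={"scale": Scale(3.0), "offset": Offset(1.0)},
    connections=[("scale", "output", "offset", "input")],
)
out = net.fprop({"scale": {"input": 2.0}})
```

In the contrasting PLearn Variable (Theano-like) style, Offset would instead be constructed with a reference to the Scale instance and would ask it for its value directly.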
