pylearn: doc/v2_planning/neural

annotate doc/v2_planning/neural_net.txt @ 1092:aab9c261361c

neural_net: Added info about how PLearn was doing it

author	Olivier Delalleau <delallea@iro>
date	Sun, 12 Sep 2010 15:12:19 -0400
parents	a80b296eb0df
children	0b666177f725

rev	line source
1088 e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	1 Neural Net committee
e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	2 ====================
e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	3
e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	4 Members:
e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	5 - Razvan Pascanu
e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	6 - James Bergstra
e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	7 - Xavier Glorot
e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	8
e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	9 (Add your name here if you want)
e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	10
e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	11
e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	12 Objective (Razvan)
e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	13 ---------
e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	14
1090 a80b296eb0df I removed big picture from the description of the neural network committee Razvan Pascanu <r.pascanu@gmail.com> parents: 1088 diff changeset	15 Come up with a description of how to write learners (how to combine
a80b296eb0df I removed big picture from the description of the neural network committee Razvan Pascanu <r.pascanu@gmail.com> parents: 1088 diff changeset	16 optimizer, structure, error measure, how to talk to datasets, tasks (if there
a80b296eb0df I removed big picture from the description of the neural network committee Razvan Pascanu <r.pascanu@gmail.com> parents: 1088 diff changeset	17 is anything like a dataset object in your view) and so on).
a80b296eb0df I removed big picture from the description of the neural network committee Razvan Pascanu <r.pascanu@gmail.com> parents: 1088 diff changeset	18 o The way I see it personally, we should pick "random" interfaces for any component
a80b296eb0df I removed big picture from the description of the neural network committee Razvan Pascanu <r.pascanu@gmail.com> parents: 1088 diff changeset	19 for which there is none yet, or change the interface to answer our needs.
a80b296eb0df I removed big picture from the description of the neural network committee Razvan Pascanu <r.pascanu@gmail.com> parents: 1088 diff changeset	20 For our description of how these things fit together, I would say come up with
a80b296eb0df I removed big picture from the description of the neural network committee Razvan Pascanu <r.pascanu@gmail.com> parents: 1088 diff changeset	21 pseudo-code for some tasks (that vary as much as possible) + text describing
a80b296eb0df I removed big picture from the description of the neural network committee Razvan Pascanu <r.pascanu@gmail.com> parents: 1088 diff changeset	22 all the missing details.
1088 e254065e7fd7 File for the new committee neural networks Razvan Pascanu <r.pascanu@gmail.com> parents: diff changeset	23
1092 aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	24 Link with PLearn
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	25 ----------------
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	26
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	27 OD: This is basically what the OnlineLearningModule framework was doing in
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	28 PLearn (cf. PLearn/plearn_learners/online). Basically, the idea was that a
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	29 module was a "box" with so-called "ports" representing inputs / outputs. So
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	30 for instance you could think of an RBM as a module with "visible" and
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	31 "hidden" ports, but also "log_p_visible", "energy", etc. You would use
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	32 such a module by calling an fprop method where you would give some values for
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	33 input ports (not necessarily all of them), and would ask for some output ports
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	34 (not necessarily all of them). Some ports could be used either as inputs or
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	35 outputs (e.g. the "hidden" port could be used as input to compute
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	36 P(visible|hidden), or as output to compute E[hidden|visible]). Optimization
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	37 was achieved independently within each module, which would be provided a
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	38 gradient w.r.t. some of its ports (considered outputs), and asked to update
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	39 its internal parameters and compute accordingly a gradient w.r.t. its input
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	40 ports.
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	41
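As a rough illustration of the port-based design described above, here is a
minimal Python sketch of a module with an "input" and an "output" port. The
class name AffineModule, the fprop(given, wanted) and bprop_update signatures,
and the plain gradient-descent update are all assumptions made for this
example; they are not the actual OnlineLearningModule API from PLearn.

```python
class AffineModule:
    """Hypothetical port-based module computing output = w . input + b."""

    ports = ("input", "output")

    def __init__(self, weights, bias=0.0, learning_rate=0.1):
        self.weights = list(weights)
        self.bias = bias
        self.learning_rate = learning_rate

    def fprop(self, given, wanted):
        """Compute the ports listed in `wanted` from the port values in `given`."""
        results = {}
        for port in wanted:
            if port == "output" and "input" in given:
                x = given["input"]
                results[port] = sum(w * xi for w, xi in zip(self.weights, x)) + self.bias
            else:
                raise ValueError("cannot compute port %r from %r" % (port, sorted(given)))
        return results

    def bprop_update(self, given, output_grads):
        """Receive d(cost)/d(output), update parameters, return d(cost)/d(input)."""
        x = given["input"]
        g = output_grads["output"]
        # The gradient w.r.t. the input uses the weights as they were before the update.
        input_grad = [g * w for w in self.weights]
        # The module optimizes itself: here, one step of plain gradient descent.
        self.weights = [w - self.learning_rate * g * xi for w, xi in zip(self.weights, x)]
        self.bias -= self.learning_rate * g
        return {"input": input_grad}


module = AffineModule(weights=[1.0, -2.0], bias=0.5)
out = module.fprop({"input": [3.0, 1.0]}, wanted=["output"])
grads = module.bprop_update({"input": [3.0, 1.0]}, {"output": 1.0})
```

The caller only deals with port names and values. The update rule stays
private to the module, which is the property the text attributes to this
framework.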
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	42 Although it worked, it had some issues:
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	43 - The biggest problem was that as you added more ports and options to do
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	44 different computations, the fprop method would grow and grow and become very
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	45 difficult to write properly to handle all possible combinations of inputs /
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	46 outputs, while remaining efficient. Hopefully this is where Theano can be a
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	47 big help (note: a "lazy if" could be required to handle situations where the
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	48 same port is computed in very different ways depending on what is given as
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	49 input).
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	50 - We had to introduce a notion of 'states' that were ports whose values had to
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	51 be computed, even if they were not asked for by the user. The reason was that
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	52 those values were required to perform the optimization (bprop phase) without
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	53 re-doing some computations. Hopefully again Theano could take care of it
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	54 (those states were potentially confusing to the user, who had to manipulate
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	55 them without necessarily understanding what they were for).
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	56
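To make the first issue concrete, the sketch below shows an fprop method for
a tiny binary RBM whose "hidden" port can act as input or output. Every
supported combination of given and wanted ports needs its own branch, so the
method grows with each new port. The class TinyRBMModule, its constructor
arguments and the binary-unit formulas are assumptions for illustration only,
not code taken from PLearn.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


class TinyRBMModule:
    """Hypothetical binary RBM exposed through ports (not PLearn's class)."""

    def __init__(self, W, visible_bias, hidden_bias):
        self.W = W  # W[j][i] couples hidden unit j with visible unit i
        self.visible_bias = visible_bias
        self.hidden_bias = hidden_bias

    def fprop(self, given, wanted):
        results = {}
        for port in wanted:
            if port == "hidden" and "visible" in given:
                # "hidden" used as an output: E[hidden | visible]
                v = given["visible"]
                results[port] = [
                    sigmoid(c + sum(w * vi for w, vi in zip(row, v)))
                    for row, c in zip(self.W, self.hidden_bias)
                ]
            elif port == "visible" and "hidden" in given:
                # "hidden" used as an input: P(visible_i = 1 | hidden)
                h = given["hidden"]
                results[port] = [
                    sigmoid(b + sum(self.W[j][i] * h[j] for j in range(len(h))))
                    for i, b in enumerate(self.visible_bias)
                ]
            elif port == "energy" and "visible" in given and "hidden" in given:
                v, h = given["visible"], given["hidden"]
                interaction = sum(
                    h[j] * self.W[j][i] * v[i]
                    for j in range(len(h))
                    for i in range(len(v))
                )
                results[port] = -(
                    interaction
                    + sum(b * vi for b, vi in zip(self.visible_bias, v))
                    + sum(c * hj for c, hj in zip(self.hidden_bias, h))
                )
            else:
                # Each new port or input combination would add another branch here.
                raise ValueError("cannot compute port %r from %r" % (port, sorted(given)))
        return results


rbm = TinyRBMModule(W=[[1.0, 2.0]], visible_bias=[0.5, 0.0], hidden_bias=[-1.0])
hidden_mean = rbm.fprop({"visible": [1.0, 0.0]}, wanted=["hidden"])["hidden"]
visible_prob = rbm.fprop({"hidden": [1.0]}, wanted=["visible"])["visible"]
energy = rbm.fprop({"visible": [1.0, 0.0], "hidden": [1.0]}, wanted=["energy"])["energy"]
```

With only three ports there are already three branches plus an error case.
Adding "log_p_visible" or sampling ports would multiply them further.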
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	57 Besides that, there are at least 3 design decisions that could be made
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	58 differently:
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	59 - How to connect those modules together: in those OnlineLearningModules, each
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	60 module had no idea of who it was connected to. A higher level entity was
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	61 responsible for grabbing the output of some module and forwarding it to its
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	62 target destination. This is to be contrasted with the design of PLearn
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	63 Variables, where each variable was explicitly constructed given its input
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	64 variables (Theano-like), and would directly ask them to provide data. I am not
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	65 sure what the pros and cons of these two approaches are.
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	66 - How to perform optimization. The OnlineLearningModule way is nice to plug
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	67 together pieces that are optimized very differently, because each module is
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	68 responsible for its own optimization. However, this also means it is difficult
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	69 to easily try different global optimizers (again, this is in contrast with
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	70 PLearn variables).
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	71 - One must think about the issue of RNG for stochastic modules. Here we had
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	72 one single RNG per module. This makes it difficult to easily try different
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	73 seeds for everyone. On the other hand, sharing a single RNG is not necessarily
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	74 a good idea because of potentially unwanted side-effects.
aab9c261361c neural_net: Added info about how PLearn was doing it Olivier Delalleau <delallea@iro> parents: 1090 diff changeset	75
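The RNG trade-off in the last point can be illustrated with a small Python
sketch using only the standard library. NoiseModule, run_shared and
run_per_module are hypothetical names invented for this example. It shows one
kind of unwanted side-effect of a shared RNG: changing how many numbers one
module draws changes what the next module samples. With one RNG per module
the second module is unaffected, but every module then needs its own seed.

```python
import random


class NoiseModule:
    """Hypothetical stochastic module that draws `n_draws` numbers per call."""

    def __init__(self, rng, n_draws):
        self.rng = rng
        self.n_draws = n_draws

    def sample(self):
        return [self.rng.random() for _ in range(self.n_draws)]


def run_shared(seed, first_draws):
    """Both modules consume numbers from the same RNG stream."""
    rng = random.Random(seed)
    first = NoiseModule(rng, first_draws)
    second = NoiseModule(rng, 2)
    first.sample()
    return second.sample()


def run_per_module(seeds, first_draws):
    """Each module owns its RNG, so one seed per module must be chosen."""
    first = NoiseModule(random.Random(seeds[0]), first_draws)
    second = NoiseModule(random.Random(seeds[1]), 2)
    first.sample()
    return second.sample()


# Does changing the first module alter what the second module samples?
shared_changed = run_shared(0, 1) != run_shared(0, 3)
separate_changed = run_per_module((0, 1), 1) != run_per_module((0, 1), 3)
```

Here shared_changed is True and separate_changed is False. The price of the
per-module design is visible in the seeds tuple, which must be varied as a
whole to try a different seed "for everyone".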
