annotate doc/v2_planning/arch_src/plugin_JB_comments_YB.txt @ 1419:cff305ad9f60

TensorFnDataset - added x_ attribute that caches the dataset function return value, but does not get pickled.
author James Bergstra <bergstrj@iro.umontreal.ca>
date Fri, 04 Feb 2011 16:05:22 -0500
parents 4a1339682c8f
children
rev   line source
1238
067b2f9ba122 comments by YB on JB's arch_sr/plugin_JB.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff changeset
1
2 YB. I am very worried about this proposal. It looks again like we would be
3 creating another language to replace one we already have, namely python,
4 mainly so that we could have introspection and programmable changes
5 into an existing control flow structure (e.g. the standard DBN code).
6
7 I feel that the negatives outweigh the advantages.
8
9 Please correct me:
10
11 Disadvantages:
12
13 * much more difficult to read
14 * much more difficult to debug
15
1247
8dfe9d6e72f6 plugin_JB replies
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1245
diff changeset
16 JB asks: I would like to try and correct you, but I don't know where to begin --
17 - What do you think is more difficult to read [than what?] and why?
18 - What do you expect to be more difficult [than what?] to debug?
19
20
1238
067b2f9ba122 comments by YB on JB's arch_sr/plugin_JB.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff changeset
21 Advantages:
22
23 * easier to serialize (can't we serialize an ordinary Python class created by a normal user?)
24 * possible but not easier to programmatically modify existing learning algorithms
25 (why not the usual constructor parameters and hooks,
26 when possible, and just create another code for a new DBN variant when it can't fit?)
27 * am I missing something?
28
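For reference, the "usual constructor parameters and hooks" style that YB has in mind might look like the minimal sketch below. The Trainer class, its parameters, and the on_epoch_end hook are hypothetical names invented for illustration; they are not taken from plugin_JB.py or from any existing library.

```python
class Trainer:
    """Hypothetical trainer configured the 'traditional' way."""

    def __init__(self, n_epochs=3, lr=0.1, on_epoch_end=None):
        self.n_epochs = n_epochs
        self.lr = lr
        # Hooks are fixed extension points chosen by the class author.
        self.on_epoch_end = list(on_epoch_end or [])
        self.w = 0.0

    def fit(self):
        for epoch in range(self.n_epochs):
            self.w += self.lr  # stand-in for a real parameter update
            for hook in self.on_epoch_end:
                hook(self, epoch)
        return self.w


history = []
trainer = Trainer(n_epochs=4, lr=0.5,
                  on_epoch_end=[lambda t, e: history.append((e, t.w))])
trainer.fit()
print(history)
```

In this style, behaviour can only be changed at the points where the class author provided a parameter or a hook; anything else requires editing or copying the class.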
1247
8dfe9d6e72f6 plugin_JB replies
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1245
diff changeset
29 JB replies:
30 - Re serializability - I think any system that supports program pausing,
31 resuming, and dynamic editing (a.k.a. process surgery) will have the flavour
32 of my proposal. If someone has a better idea, he should suggest it.
33
34 - Re hooks & constructors - the mechanism I propose is more flexible than hooks and constructor
35 parameters. Hooks and constructor parameters have their place, and would be
36 used under my proposal as usual to configure the modules on which the
37 flow-graph operates. But flow-graphs are more flexible. Flow-graphs
38 (REPEAT, CALL, etc.) that are constructed by library calls can be directly
39 modified. You can add new hooks, for example, or add a print statement
40 between two statements (CALLs) that previously had no hook between them.
41 - the analogous thing using the real Python VM would be to dynamically
42 re-program Python's compiled bytecode, which I don't think is possible.
43
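The kind of edit JB describes can be illustrated with a minimal sketch. The CALL, SEQ, and REPEAT classes below are hypothetical stand-ins written for this example, not the actual plugin_JB.py API; the point is only that a program built as a data structure can have a step spliced in after construction.

```python
class CALL:
    """Leaf node: call a Python function on the shared state dict."""

    def __init__(self, fn):
        self.fn = fn

    def run(self, state):
        self.fn(state)


class SEQ:
    """Run child nodes in order; `steps` is a plain, editable list."""

    def __init__(self, *steps):
        self.steps = list(steps)

    def run(self, state):
        for step in self.steps:
            step.run(state)


class REPEAT:
    """Run a child node a fixed number of times."""

    def __init__(self, n, body):
        self.n = n
        self.body = body

    def run(self, state):
        for _ in range(self.n):
            self.body.run(state)


def update(state):
    state["w"] += 1.0


def evaluate(state):
    state["err"] = 10.0 - state["w"]


def build_program():
    # Stand-in for a "library call" that constructs a learning algorithm.
    return REPEAT(3, SEQ(CALL(update), CALL(evaluate)))


program = build_program()

log = []


def record(state):
    log.append(state["w"])


# Splice a new step between two CALLs that had no hook between them.
program.body.steps.insert(1, CALL(record))

state = {"w": 0.0, "err": None}
program.run(state)
print(log, state["err"])
```

The library author never anticipated `record`; the caller added it by editing the graph rather than the library source.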
1238
067b2f9ba122 comments by YB on JB's arch_sr/plugin_JB.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff changeset
44 I am not convinced that any of the stated advantages can't be achieved in more traditional ways.
45
1242
316410a38f6f comment on Yoshua's comment on James architecture
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1238
diff changeset
46 RP comment: James or anybody else correct me if I'm wrong. What I think James
47 proposed is just a way of encapsulating different steps of the program in some
48 classes. These classes are serializable. They are not a programming language
49 per se. The way I see it is like dividing your program into a set of functions.
50 Each function is a control flow element applied to something (like a CALL to
51 a Python function). The idea is to wrap these functions in something to
52 make them serializable, and also offer the added advantage that you have a
53 graph that presents the order in which you should call the functions and you
54 can play with that order.
1238
067b2f9ba122 comments by YB on JB's arch_sr/plugin_JB.py
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents:
diff changeset
55
1242
316410a38f6f comment on Yoshua's comment on James architecture
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1238
diff changeset
56 That is why I was trying to convince James to re-write things (using some
57 syntactic sugar) to make it look less intimidating (I believe it can look
58 much more "traditional" than it looks right now). I think a programming
59 language might also be an overloaded term, so we might be speaking about
60 different things. But if all that his proposal does is to offer some wrapper
61 around Python functions that makes them serializable, and generates an execution
62 order graph in which you can possibly do simple operations (like
63 substitutions and replacements) I would not call it a programming language.
64
65 I think the advantage of making the program aware of where in its own execution
66 flow it is and what is its execution flow can be quite useful for automating
67 some of the things we want.
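RP's reading (serializable wrappers plus an editable execution order) can be sketched as follows. This is an illustration under assumptions, not James' design: steps are stored by name in a hypothetical REGISTRY so that the program, including its position `pc`, is plain JSON-serializable data.

```python
import json


def pretrain(state):
    state["trace"].append("pretrain")


def finetune(state):
    state["trace"].append("finetune")


def finetune_careful(state):
    state["trace"].append("finetune_careful")


# Steps are stored by name, so the program itself is plain data.
REGISTRY = {f.__name__: f for f in (pretrain, finetune, finetune_careful)}

program = {"steps": ["pretrain", "finetune"], "pc": 0}


def run(program, state):
    """Execute the remaining steps, advancing the program counter `pc`."""
    while program["pc"] < len(program["steps"]):
        REGISTRY[program["steps"][program["pc"]]](state)
        program["pc"] += 1


# Serialize, then "play with" the execution order: substitute one step.
saved = json.dumps(program)
restored = json.loads(saved)
restored["steps"][1] = "finetune_careful"

state = {"trace": []}
run(restored, state)
print(state["trace"])
```

Because `pc` travels with the program, the restored copy knows where in its own execution flow it is, which is the property RP singles out as useful for automation.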
1245
808e38dce8d6 Replied to YB's comment on JB's system
Olivier Delalleau <delallea@iro>
parents: 1242
diff changeset
68
69 OD comments: I agree with Yoshua. I actually thought (from looking at the
70 discussions in these files from a rather high-level point-of-view) the main
71 goal of this machinery was to help with parallelization. If that is the case,
72 it may prove useful in some places, but it is not something that one would
73 want to use everywhere. As far as serialization is concerned, I think this
74 should be do-able without such a system (provided we all agree that we do not
75 necessarily require the ability to serialize / restart at any point). About
76 the ability to move / substitute things, you could probably achieve the same
77 goal with proper code factorization / conventions.
1247
8dfe9d6e72f6 plugin_JB replies
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1245
diff changeset
78
79 JB replies:
80 You are right that with sufficient discipline on everyone's part,
81 and a good design using existing python control flow (loops and functions) it is
82 probably possible to get many of the features I'm claiming with my proposal.
83
84 But I don't think Python offers a very helpful syntax or control flow elements
85 for programming parallel distributed computations, because the Python
86 interpreter doesn't do that.
87
88 What I'm trying to design is a mechanism that can allow us to *express the entire
89 learning algorithm* in a program. That means
90 - including the grid-search,
91 - including the use of the cluster,
92 - including the pre-processing and post-processing.
93
94 To make that actually work, programs need to be more flexible - we need to be
95 able to pause and resume 'function calls', and to possibly restart them if we
96 find a problem (without having to restart the whole program). We already do
97 these things in ad-hoc ways by writing scripts, generating intermediate files,
98 etc., but I think we would empower ourselves by using a tool that lets us
99 actually write down the *WHOLE* algorithm, in one place rather than as a README
100 with a list of scripts and instructions for what to do with them (especially
101 because the README often never gets written).
102
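The pause / resume / restart behaviour JB asks for can be sketched in a few lines. Everything here is an assumption made for illustration (the step functions, the checkpoint layout with `pc` and `state`, and the `max_steps` budget); it is not the proposed plugin mechanism, only a toy showing one failed step being redone without rerunning the whole program.

```python
import json


def preprocess(state):
    state["data"] = [1, 2, 3]


def train(state):
    state["model"] = sum(state["data"]) * state["lr"]


def postprocess(state):
    state["report"] = "model=%.2f" % state["model"]


STEPS = [preprocess, train, postprocess]


def run(checkpoint, max_steps=None):
    """Run steps from checkpoint["pc"]; stop early after max_steps if given."""
    done = 0
    while checkpoint["pc"] < len(STEPS):
        if max_steps is not None and done >= max_steps:
            break  # pause here; the checkpoint records where we stopped
        STEPS[checkpoint["pc"]](checkpoint["state"])
        checkpoint["pc"] += 1
        done += 1
    return checkpoint


# Run the first two steps, then pause and serialize the checkpoint.
ckpt = run({"pc": 0, "state": {"lr": 0.5}}, max_steps=2)
blob = json.dumps(ckpt)  # in practice this would be written to disk

# Later: we notice the learning rate was wrong, so rewind one step
# and redo only `train`, keeping the preprocessed data.
resumed = json.loads(blob)
resumed["state"]["lr"] = 0.1
resumed["pc"] = 1
run(resumed)
print(resumed["state"]["report"])
```

The same information usually lives in intermediate files and ad-hoc scripts; here it sits in one structure next to the list of steps.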
1250
ab1db1837e98 JB's plugin: Reply to JB
Olivier Delalleau <delallea@iro>
parents: 1247
diff changeset
103 OD replies: I can see such a framework being useful for high-level experiment
104 design (the "big picture", or how to plug different components together). What
105 I am not convinced about is that we should also use it to write a standard
106 serial machine learning algorithm (e.g. DBN training with fixed
107 hyper-parameters).
1251
70ca63c05672 comment on OD's reply
Razvan Pascanu <r.pascanu@gmail.com>
parents: 1250
diff changeset
108
109 RP replies: What do you understand by writing down a DBN? I believe the
110 structure and so on (selecting the optimizers) shouldn't be done using this
111 approach. You will start using this syntax to do early stopping, to decide the
112 order of pre-training the layers. In my view you get something like
113 pretrain_layer1, pretrain_layer2, finetune_one_step and then start using
114 James' framework. Are you thinking in the same terms?
1252
4a1339682c8f Reply to RP
Olivier Delalleau <delallea@iro>
parents: 1251
diff changeset
115
116 OD replies: Actually I wasn't thinking of using it at all inside a DBN's code.
117 I forgot early stopping for each layer's training though, and it's true it may
118 be useful to take advantage of some generic mechanism there... but I wouldn't
119 use James' framework for it.