comparison doc/v2_planning/architecture_discussion.txt @ 1260:a565c20a39d7
general file to talk about the different approaches
author | Razvan Pascanu <r.pascanu@gmail.com> |
---|---|
date | Sun, 26 Sep 2010 14:10:33 -0400 |
parents | |
children | 93e1c7c9172b |
1259:6f76ecef869e | 1260:a565c20a39d7 |
---|---|
Arnaud:

From what I recall of the meeting last Friday, we saw three
proposals for a runtime architecture for the experiments in
pylearn.

The thing I noticed was that no two of the proposals were
addressing the same problem. So not only do we have to choose which
one(s) we want, but we also have to decide what we need.

The proposals and the problems they address are outlined below; please
comment if you see inaccuracies:

- PL's proposal, the hooks thing, was about enabling hooks to be
  registered at predefined points in functions and giving them access to
  the local variables. This nicely addresses the problem of collecting
  stats and printing progress.

- OB's proposal, the checkpoints thing, was about enabling the saving
  and loading of state at predefined points in the function. Other
  actions could also be performed at these points.

- JB's proposal, the new language thing, was about expressing
  algorithms with a control structure made of classes so that its state
  and structure could be preserved. It could also define new control
  structures to run things in parallel, over multiple machines or not.
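
To make the second item a bit more concrete, here is a minimal sketch of
what saving and loading state at a predefined point could look like. This is
not OB's actual design: the names train and load_state, the dict layout of
the state, and the use of pickle are all assumptions made for illustration.

```python
import os
import pickle
import tempfile


def train(n_steps, checkpoint_path=None, state=None):
    """Toy training loop with one predefined checkpoint per step.

    `state` is a plain dict (an assumption of this sketch) holding
    everything needed to resume.
    """
    if state is None:
        state = {"step": 0, "total": 0.0}
    while state["step"] < n_steps:
        state["total"] += state["step"]  # stand-in for real work
        state["step"] += 1
        # Predefined point: the state is saved here, and other actions
        # could be performed here as well.
        if checkpoint_path is not None:
            with open(checkpoint_path, "wb") as f:
                pickle.dump(state, f)
    return state


def load_state(checkpoint_path):
    """Load a state previously saved by `train`."""
    with open(checkpoint_path, "rb") as f:
        return pickle.load(f)


# Usage: run 3 steps, reload the saved state, then continue up to 5 steps.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "state.pkl")
    train(3, checkpoint_path=path)
    resumed = train(5, state=load_state(path))

# Same number of steps without any interruption, for comparison.
uninterrupted = train(5)
```

The point of the sketch is only that a run resumed from the saved state ends
up in the same state as an uninterrupted one.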

Razvan:

I would add the following observations:

#1
---

This might be an artificially created issue, but I will write it down anyhow.
We can decide later if we care about it.

Imagine you have some function provided by the library that implements
some (complicated) pattern. Let's say deeplearning (pretraining followed
by finetuning). You instantiate this somehow:

    instance = deeplearning(..)

Now you want to add some function to a given hook, checkpoint or whatever
to calculate some statistics. You can of course do that (the documentation
can tell you how those hooks are named), but what the function will get is
the locals defined in deeplearning. So you need to open up the file that
implements deeplearning and understand the code to figure out what
variable does what.

Secondly, if you need to execute a function at a place that deeplearning
did not foresee, you can only do that by hacking the file implementing
the deeplearning function, i.e. by hacking the library. One can make sure that
does not happen by overpopulating the code with hooks, but then we need
a name for each hook.
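
A small sketch may help show both issues. Everything in it is hypothetical:
the registry (HOOKS, register, run_hooks), the hook names, and the toy body
of deeplearning are invented here and are not PL's proposal or any pylearn
code. For brevity deeplearning is a plain function rather than something
you instantiate.

```python
HOOKS = {}  # hook name -> list of callables (hypothetical registry)


def register(name, fn):
    """Attach `fn` to the hook called `name`."""
    HOOKS.setdefault(name, []).append(fn)


def run_hooks(name, local_vars):
    """Call every function registered on `name` with the caller's locals."""
    for fn in HOOKS.get(name, []):
        fn(local_vars)


def deeplearning(n_pretrain=2, n_finetune=3):
    """Toy stand-in for a library function with two predefined hooks."""
    w = 0.0
    for i in range(n_pretrain):
        w += 1.0
        run_hooks("after_pretrain_step", locals())
    for j in range(n_finetune):
        err = 1.0 / (j + 1)
        w -= 0.1
        run_hooks("after_finetune_step", locals())
    return w


seen = []


def collect_error(local_vars):
    # Issue 1: the hook author has to read the body of deeplearning to
    # learn that the quantity of interest is stored in a local named "err".
    seen.append(local_vars["err"])


register("after_finetune_step", collect_error)
final_w = deeplearning()

# Issue 2: there is no hook between pretraining and finetuning, so running
# something there would mean editing deeplearning itself.
```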

I would add that in most cases the logic that goes into this is probably
simple enough that the issues above are insignificant, but I might be wrong.


#2
---

I think it is much healthier to think of James' proposal as a glorified
pipeline and not as a new language. You have components that you add to
your pipeline. A CALL is such a component. You run the program by executing
the pipeline (which goes from one component to the next and calls it).

We are dealing with a glorified pipeline because:
- when running the pipeline you can loop over a certain segment of the
  pipeline if you need to
- you can, at run time, switch between two possible terminations of the
  pipeline (the if command)
- you can have two pipelines running in parallel, by running one
  component from one pipeline and then going to the other
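
The three points above can be sketched in a few lines. This is only my
reading of the pipeline idea, not James' actual code: the class names (Call,
Seq, Loop, If), the interleave helper, and the shared state dict are all
assumptions made for the example.

```python
from itertools import zip_longest


class Call:
    """A component that calls a function on the shared state."""
    def __init__(self, fn):
        self.fn = fn

    def run(self, state):
        self.fn(state)


class Seq:
    """A pipeline: runs its components one after the other."""
    def __init__(self, *parts):
        self.parts = list(parts)

    def run(self, state):
        for part in self.parts:
            part.run(state)


class Loop:
    """Repeats a segment of the pipeline while a condition holds."""
    def __init__(self, cond, body):
        self.cond = cond
        self.body = body

    def run(self, state):
        while self.cond(state):
            self.body.run(state)


class If:
    """Chooses at run time between two possible terminations."""
    def __init__(self, cond, then, otherwise):
        self.cond = cond
        self.then = then
        self.otherwise = otherwise

    def run(self, state):
        branch = self.then if self.cond(state) else self.otherwise
        branch.run(state)


def interleave(first, second, state):
    """Run two pipelines "in parallel" by alternating their components."""
    for a, b in zip_longest(first.parts, second.parts):
        if a is not None:
            a.run(state)
        if b is not None:
            b.run(state)


def log(msg):
    """Build a Call component that records `msg` in the state."""
    return Call(lambda state: state["log"].append(msg))


def step(state):
    state["epoch"] += 1


# Loop over one segment, then pick one of two terminations.
pipeline = Seq(
    Loop(lambda s: s["epoch"] < 3, Call(step)),
    If(lambda s: s["epoch"] >= 3, log("converged"), log("stopped early")),
)
state = {"epoch": 0, "log": []}
pipeline.run(state)

# Two pipelines advanced one component at a time.
state2 = {"log": []}
interleave(Seq(log("a1"), log("a2")), Seq(log("b1")), state2)
```

Because every component is an object, the structure of the program stays
inspectable between any two components.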

You can also think of what James proposes as sort of the same as
Olivier's, with the following differences:
- Olivier makes this entire mechanism invisible to the eye, while in
  James' case it is explicit
- James has implicit checkpoints between any two components; in Olivier's
  case you can define pipelines at different points (maybe even more
  finely grained than what James' mechanism offers)
- One can imagine, though Olivier did not explain exactly how, that
  you could have hooks in a template such that you do not actually need
  to hack that code.

James' proposal also offers a way of expressing the distributed part in
your main program. It is the same as having two pipelines between which you
switch. Just imagine that each pipeline now runs on a different machine
independently and you, as the server, just wait for them to return. This
is just one possibility.

In this proposal you can also see how you would solve the unforeseen hooks
problem, by having a special function that could alter the pipeline in some
way (for example by introducing new components).
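
As a rough illustration of such a special function, assume a pipeline is
just a list of (name, function) pairs. That representation, and the names
insert_after and run, are invented for this sketch and are not part of
James' proposal.

```python
def insert_after(pipeline, name, component):
    """Return a new pipeline with `component` placed right after step `name`.

    `pipeline` is a list of (name, function) pairs; the input list is left
    untouched, so the library's own pipeline is never modified.
    """
    new_pipeline = []
    found = False
    for step_name, fn in pipeline:
        new_pipeline.append((step_name, fn))
        if step_name == name:
            new_pipeline.append(component)
            found = True
    if not found:
        raise KeyError(name)
    return new_pipeline


def run(pipeline, state):
    """Execute each component in order on the shared state."""
    for _, fn in pipeline:
        fn(state)
    return state


def pretrain(state):
    state["trace"].append("pretrain")


def finetune(state):
    state["trace"].append("finetune")


def stats(state):
    state["trace"].append("stats")


# The library ships this pipeline; the user adds a step at a place the
# library author did not foresee, without editing the library.
library_pipeline = [("pretrain", pretrain), ("finetune", finetune)]
user_pipeline = insert_after(library_pipeline, "pretrain", ("stats", stats))
result = run(user_pipeline, {"trace": []})
```

The user still needs to know the component names, but not the local
variables of the library code.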