pylearn: doc/v2_planning/architecture_discussion.txt comparison

comparison doc/v2_planning/architecture_discussion.txt @ 1260:a565c20a39d7

general file to talk about the different approaches

author	Razvan Pascanu <r.pascanu@gmail.com>
date	Sun, 26 Sep 2010 14:10:33 -0400
parents
children	93e1c7c9172b


-:6f76ecef869e
+:a565c20a39d7
+Arnaud:
+From what I recall from the meeting last Friday, we saw three
+proposals for a runtime architecture for the experiments in
+pylearn.
+The thing I noticed was that each of the three proposals addresses
+a different problem.  So not only do we have to choose which
+one(s) we want, but we also have to decide what we need.
+The proposals and the problems they address are outlined below; please
+comment if you see inaccuracies:
+- PL's proposal, the hooks thing, was about enabling hooks to be
+registered at predefined points in functions and giving them access to
+the local variables.  This addresses nicely the problem of collecting
+stats and printing progress.
+- OB's proposal, the checkpoints thing, was about enabling the saving
+and loading of state at predefined points in the function.  Other
+actions could also be performed at these points.
+- JB's proposal, the new language thing, was about expressing
+algorithms with a control structure made of classes so that its state
+and structure could be preserved.  It could also define new control
+structures to run things in parallel, over multiple machines or not.
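To make the first proposal concrete, here is a minimal sketch of hooks registered at a predefined point and handed the local variables of the function. All names here (`HookRegistry`, `train`, the `"end_of_epoch"` point) are hypothetical and chosen for illustration; they are not taken from pylearn or from PL's actual proposal.

```python
from collections import defaultdict


class HookRegistry:
    """Hypothetical registry mapping a named point to a list of callables."""

    def __init__(self):
        self._hooks = defaultdict(list)

    def register(self, point, fn):
        self._hooks[point].append(fn)

    def fire(self, point, local_vars):
        # Every hook registered at this point receives the caller's locals.
        for fn in self._hooks[point]:
            fn(local_vars)


def train(n_epochs, hooks):
    """Stand-in for a library training loop with one predefined hook point."""
    cost = 1.0
    for epoch in range(n_epochs):
        cost = cost / 2  # placeholder for a real training step
        hooks.fire("end_of_epoch", locals())
    return cost


# Collect statistics without touching the body of train().
history = []
hooks = HookRegistry()
hooks.register("end_of_epoch", lambda v: history.append((v["epoch"], v["cost"])))
final_cost = train(3, hooks)
```

A checkpoint mechanism in the spirit of the second proposal could reuse the same points, with the registered action saving or restoring state instead of collecting statistics.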
+Razvan:
+I would add the following observations:
+#1
+---
+This might be an artificially created issue, but I will write it down anyhow.
+We can decide later if we care about it.
+Imagine you have some function provided by the library that implements
+some (complicated) pattern. Let's say deeplearning (pretraining followed
+by finetuning). You instantiate this somehow:
+instance = deeplearning(..)
+Now you want to add some function to a given hook, checkpoint or whatever
+to calculate some statistics. You can of course do that (the documentation
+can tell you how those hooks are named), but what the function will get is
+the locals defined in deeplearning. So you need to open up the file that
+implements deeplearning and understand the code to figure out which
+variable does what.
+Secondly, if you need to execute a function at a place that deeplearning
+did not foresee, you can only do that by hacking the file implementing
+the deeplearning function, i.e. by hacking the library. One can make sure that
+does not happen by overpopulating the code with hooks, but then we need
+a name for each hook.
+I can add that probably in most cases the logic that goes into this is
+simple enough that the issues above are insignificant, but I might be wrong.
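A small sketch of the first issue, under the assumption that hooks receive the raw locals of the library function. The body of `deeplearning` and the variable names `layer` and `err` are invented for illustration only.

```python
def deeplearning(n_layers, hook=None):
    """Hypothetical library function; its local names are an internal detail."""
    pretrain_errors = []
    for layer in range(n_layers):
        err = 1.0 / (layer + 1)  # placeholder for a real pretraining error
        pretrain_errors.append(err)
        if hook is not None:
            hook(locals())
    return pretrain_errors


seen = []


def my_stats(local_vars):
    # Works only because the user read the source and learned the names.
    seen.append((local_vars["layer"], local_vars["err"]))


deeplearning(2, hook=my_stats)


def guessed_stats(local_vars):
    # A plausible guess at the variable name that the library does not use.
    return local_vars["layer_error"]


try:
    deeplearning(1, hook=guessed_stats)
    guess_worked = True
except KeyError:
    guess_worked = False
```

The hook is coupled to names that the documentation of the hook points alone would not reveal, which is the concern raised above.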
+#2
+---
+I think it is much healthier to think of James' proposal as a glorified
+pipeline and not as a new language. You have components that you add to
+your pipeline. A CALL is such a component. You run the program by executing
+the pipeline (which goes from one component to the next and calls each one).
+We are dealing with a glorified pipeline because :
+- when running the pipeline you can loop over a certain segment of the
+pipeline if you need to
+- you can, at run time, switch between two possible terminations of the
+pipeline (the if command)
+- you can have two pipelines running in parallel, by running one
+component from one pipeline and then going to the other
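The three properties listed above can be sketched as follows. This is only an illustration of the "glorified pipeline" reading, not James' actual design; the class names `Call`, `Loop` and `If`, and the helpers `run_pipeline` and `run_interleaved`, are assumptions made for this example.

```python
from itertools import zip_longest


class Call:
    """Component that calls a function on the shared state."""
    def __init__(self, fn):
        self.fn = fn

    def run(self, state):
        self.fn(state)


class Loop:
    """Component that repeats a pipeline segment while a condition holds."""
    def __init__(self, segment, while_cond):
        self.segment = segment
        self.while_cond = while_cond

    def run(self, state):
        while self.while_cond(state):
            run_pipeline(self.segment, state)


class If:
    """Component that chooses between two terminations at run time."""
    def __init__(self, cond, then_segment, else_segment):
        self.cond = cond
        self.then_segment = then_segment
        self.else_segment = else_segment

    def run(self, state):
        branch = self.then_segment if self.cond(state) else self.else_segment
        run_pipeline(branch, state)


def run_pipeline(components, state):
    for component in components:
        component.run(state)
    return state


def run_interleaved(pipe_a, state_a, pipe_b, state_b):
    # Run one component from one pipeline, then one from the other.
    for a, b in zip_longest(pipe_a, pipe_b):
        if a is not None:
            a.run(state_a)
        if b is not None:
            b.run(state_b)


def train_step(state):
    state["epoch"] += 1
    state["cost"] /= 2  # placeholder for a real update


state = {"epoch": 0, "cost": 8.0, "log": []}
pipeline = [
    Loop([Call(train_step)], lambda s: s["epoch"] < 3),
    If(lambda s: s["cost"] < 2.0,
       [Call(lambda s: s["log"].append("converged"))],
       [Call(lambda s: s["log"].append("keep training"))]),
]
run_pipeline(pipeline, state)

trace = []
pipe_a = [Call(lambda s: trace.append("a1")), Call(lambda s: trace.append("a2"))]
pipe_b = [Call(lambda s: trace.append("b1"))]
run_interleaved(pipe_a, {}, pipe_b, {})
```

Because the runner moves from one component to the next, the boundary between any two components is a natural place to save or inspect `state`.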
+You can also think of what James proposes as sort of the same as
+Olivier's with the following differences:
+- Olivier makes this entire mechanism invisible to the eye while in
+James' case it is explicit
+- James has implicit checkpoints between any two components; in Olivier's
+case you can define pipelines at different points (maybe even more
+finely grained than what James' mechanism offers)
+- One can imagine (though Olivier did not explain exactly how)
+having hooks in a template such that you do not actually need
+to hack that code.
+James' proposal also offers a way of expressing the distributed part in
+your main program. It is the same as having two pipelines between which you
+switch. Just imagine that each pipeline now runs on a different machine
+independently and you, as the server, just wait for them to return. This
+is just one possibility.
+In this proposal you can also see how you would solve the unforeseen hooks
+problem, by having a special function that could alter the pipeline in some
+way (for example by introducing new components).
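One way such a pipeline-altering function might look, as a hedged sketch: components are plain functions that receive the pipeline itself, so one of them can insert a new component where no hook was foreseen. The names `pretrain`, `finetune`, `collect_stats` and `add_stats_after_pretrain` are hypothetical.

```python
def run_pipeline(pipeline, state):
    # Index-based loop so that components inserted at run time are picked up.
    position = 0
    while position < len(pipeline):
        pipeline[position](pipeline, state)
        position += 1
    return state


def pretrain(pipeline, state):
    state["trace"].append("pretrain")


def finetune(pipeline, state):
    state["trace"].append("finetune")


def collect_stats(pipeline, state):
    state["trace"].append("stats")


def add_stats_after_pretrain(pipeline, state):
    # The "special function": it alters the pipeline by inserting a new
    # component right after pretrain, without editing pretrain itself.
    index = pipeline.index(pretrain)
    pipeline.insert(index + 1, collect_stats)


state = {"trace": []}
pipeline = [add_stats_after_pretrain, pretrain, finetune]
run_pipeline(pipeline, state)
```

The library code for `pretrain` and `finetune` stays untouched; only the pipeline structure changes.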
