pylearn: doc/v2_planning/arch_src/plugin_JB_comments

comparison doc/v2_planning/arch_src/plugin_JB_comments_YB.txt @ 1247:8dfe9d6e72f6

plugin_JB replies

author	James Bergstra <bergstrj@iro.umontreal.ca>
date	Thu, 23 Sep 2010 13:20:19 -0400
parents	808e38dce8d6
children	ab1db1837e98


-:14444845989a
+:8dfe9d6e72f6
 Disadvantages:
 * much more difficult to read
 * much more difficult to debug
+JB asks: I would like to try and correct you, but I don't know where to begin --
+- What do you think is more difficult to read [than what?] and why?
+- What do you expect to be more difficult [than what?] to debug?
 Advantages:
 * easier to serialize (can't we serialize an ordinary Python class created by a normal user?)
 * possible but not easier to programmatically modify existing learning algorithms
 (why not the usual constructor parameters and hooks,
 when possible, and just create another code for a new DBN variant when it can't fit?)
 * am I missing something?
+JB replies:
+- Re serializability - I think any system that supports program pausing,
+resuming, and dynamic editing (a.k.a. process surgery) will have the flavour
+of my proposal.  If someone has a better idea, he should suggest it.
+- Re hooks & constructors - the mechanism I propose is more flexible than hooks and constructor
+parameters.  Hooks and constructor parameters have their place, and would be
+used under my proposal as usual to configure the modules on which the
+flow-graph operates.  But flow-graphs are more flexible. Flow-graphs
+(REPEAT, CALL, etc.) that are constructed by library calls can be directly
+modified.  You can add new hooks, for example, or add a print statement
+between two statements (CALLs) that previously had no hook between them.
+- The analogous thing using the real Python VM would be to dynamically
+re-program Python's compiled bytecode, which I don't think is possible.
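To illustrate what "directly modified" could mean here, the following is a minimal sketch of a flow-graph built from library calls and then edited before it runs. It is not the pylearn implementation: the class names CALL, SEQ and REPEAT only echo the element names mentioned above, and their constructors, the `steps` list and the `state` dictionary are assumptions made for this example.

```python
# Hypothetical flow-graph elements; names echo the discussion above,
# but the signatures and attributes are assumptions for illustration.

class CALL:
    """A single statement: call fn(state) when executed."""
    def __init__(self, fn):
        self.fn = fn

    def run(self, state):
        self.fn(state)


class SEQ:
    """Run a list of steps in order. The list is ordinary, editable data."""
    def __init__(self, *steps):
        self.steps = list(steps)

    def run(self, state):
        for step in self.steps:
            step.run(state)


class REPEAT:
    """Run the body a fixed number of times."""
    def __init__(self, n, body):
        self.n = n
        self.body = body

    def run(self, state):
        for _ in range(self.n):
            self.body.run(state)


def train_step(state):
    state["updates"] += 1


def validate(state):
    state["log"].append(("validate", state["updates"]))


def report(state):
    # Stands in for the "print statement" added after the fact.
    state["log"].append(("report", state["updates"]))


# The program as a library author might have built it: no hook exists
# between train_step and validate.
program = REPEAT(3, SEQ(CALL(train_step), CALL(validate)))

# A user edits the program itself, inserting a new step between the two CALLs.
program.body.steps.insert(1, CALL(report))

state = {"updates": 0, "log": []}
program.run(state)
```

Because the program is a data structure rather than compiled bytecode, the insertion needs no cooperation from whoever wrote the original loop, which is the flexibility being claimed over predefined hooks.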
 I am not convinced that any of the stated advantages can't be achieved in more traditional ways.
 RP comment: James or anybody else correct me if I'm wrong. What I think James
 proposed is just a way of encapsulating different steps of the program in some
 want to use everywhere. As far as serialization is concerned, I think this
 should be do-able without such a system (provided we all agree that we do not
 necessarily require the ability to serialize / restart at any point). About
 the ability to move / substitute things, you could probably achieve the same
 goal with proper code factorization / conventions.
+JB replies:
+You are right that with sufficient discipline on everyone's part,
+and a good design using existing Python control flow (loops and functions), it is
+probably possible to get many of the features I'm claiming with my proposal.
+But I don't think Python offers a very helpful syntax or control flow elements
+for programming parallel, distributed computations, because the Python
+interpreter doesn't do that.
+What I'm trying to design is a mechanism that can allow us to *express the entire
+learning algorithm* in a program.  That means
+- including the grid-search,
+- including the use of the cluster,
+- including the pre-processing and post-processing.
+To make that actually work, programs need to be more flexible - we need to be
+able to pause and resume 'function calls', and to possibly restart them if we
+find a problem (without having to restart the whole program).  We already do
+these things in ad-hoc ways by writing scripts, generating intermediate files,
+etc., but I think we would empower ourselves by using a tool that lets us
+actually write down the *WHOLE* algorithm, in one place rather than as a README
+with a list of scripts and instructions for what to do with them (especially
+because the README often never gets written).
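As a rough sketch of the pause, resume and restart behaviour described above, a program can be held as a list of named steps plus an explicit program counter, so that its whole state is plain data that can be written out and read back. Everything here is hypothetical: the step names, the STEPS registry, the PROGRAM list, the `run` function and the snapshot layout are assumptions for illustration, not part of the proposal's actual design.

```python
import json

# Hypothetical steps of a small experiment; each one mutates a shared dict.
def preprocess(data):
    data["x"] = [v / 10.0 for v in data["raw"]]


def train(data):
    data["model"] = sum(data["x"]) / len(data["x"])


def postprocess(data):
    data["report"] = "mean=%.2f" % data["model"]


# Steps are referenced by name so that a snapshot holds only plain data.
STEPS = {"preprocess": preprocess, "train": train, "postprocess": postprocess}
PROGRAM = ["preprocess", "train", "postprocess"]


def run(snapshot, max_steps=None):
    """Execute from snapshot['pc']; stop early after max_steps steps (a pause)."""
    done = 0
    while snapshot["pc"] < len(PROGRAM):
        if max_steps is not None and done >= max_steps:
            break
        STEPS[PROGRAM[snapshot["pc"]]](snapshot["data"])
        snapshot["pc"] += 1
        done += 1
    return snapshot


snapshot = {"pc": 0, "data": {"raw": [10, 20, 30]}}
run(snapshot, max_steps=1)       # pause after the first step

saved = json.dumps(snapshot)     # this string could be written to disk
resumed = json.loads(saved)      # ...and loaded later, possibly elsewhere
run(resumed)                     # resume where the program left off
```

Restarting a step after finding a problem would amount to setting `resumed["pc"]` back to that step's index and calling `run` again, without repeating the steps before it.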
