# HG changeset patch
# User Olivier Delalleau
# Date 1285262945 14400
# Node ID fda31afc0df6f3200c0f4f029ab107d1ed8d8e82
# Parent b9d0a326e3e7a9049a3440d12c4caba9eaaa04a5
# Parent 8dfe9d6e72f61a79f322caed7c2e7d9f3043a7d0
Merged

diff -r b9d0a326e3e7 -r fda31afc0df6 doc/v2_planning/arch_src/plugin_JB_comments_YB.txt
--- a/doc/v2_planning/arch_src/plugin_JB_comments_YB.txt	Thu Sep 23 13:20:08 2010 -0400
+++ b/doc/v2_planning/arch_src/plugin_JB_comments_YB.txt	Thu Sep 23 13:29:05 2010 -0400
@@ -13,6 +13,11 @@
  * much more difficult to read
  * much more difficult to debug
 
+JB asks: I would like to try and correct you, but I don't know where to begin --
+  - What do you think is more difficult to read [than what?] and why?
+  - What do you expect to be more difficult [than what?] to debug?
+
+
 Advantages:
 
  * easier to serialize (can't we serialize an ordinary Python class created by a normal user?)
@@ -21,6 +26,21 @@
    when possible, and just create another code for a new DBN variant when it can't fit?)
  * am I missing something?
 
+JB replies:
+  - Re serializibility - I think any system that supports program pausing,
+    resuming, and dynamic editing (a.k.a. process surgery) will have the flavour
+    of my proposal. If someone has a better idea, he should suggest it.
+  - Re hooks & constructors - the mechanism I propose is more flexible than hooks and constructor
+    parameters. Hooks and constructor parameters have their place, and would be
+    used under my proposal as usual to configure the modules on which the
+    flow-graph operates. But flow-graphs are more flexible. Flow-graphs
+    (REPEAT, CALL, etc.) that are constructed by library calls can be directly
+    modified. You can add new hooks, for example, or add a print statement
+    between two statements (CALLs) that previously had no hook between them.
+    - the analagous thing using the real python VM would be to dynamically
+      re-program Python's compiled bytecode, which I don't think is possible.
+
 I am not convinced that any of the stated advantages can't be achieved in more traditional ways.
 
 RP comment: James or anybody else correct me if I'm wrong. What I think James
@@ -55,3 +75,28 @@
 necessarily require the ability to serialize / restart at any point). About the
 ability to move / substitute things, you could probably achieve the same goal
 with proper code factorization / conventions.
+
+JB replies:
+  You are right that with sufficient discipline on everyone's part,
+  and a good design using existing python control flow (loops and functions) it is
+  probably possible to get many of the features I'm claiming with my proposal.
+
+  But I don't think Python offers a very helpful syntax or control flow elements
+  for programming parallel distributed computations through, because the python
+  interpreter doesn't do that.
+
+  What I'm trying to design is a mechanism that can allow us to *express the entire
+  learning algorithm* in a program. That means
+   - including the grid-search,
+   - including the use of the cluster,
+   - including the pre-processing and post-processing.
+
+  To make that actually work, programs need to be more flexible - we need to be
+  able to pause and resume 'function calls', and to possibly restart them if we
+  find a problem (without having to restart the whole program). We already do
+  these things in ad-hoc ways by writing scripts, generating intermediate files,
+  etc., but I think we would empower ourselves by using a tool that lets us
+  actually write down the *WHOLE* algorithm, in one place rather than as a README
+  with a list of scripts and instructions for what to do with them (especially
+  because the README often never gets written).
+
diff -r b9d0a326e3e7 -r fda31afc0df6 doc/v2_planning/code_review.txt
--- a/doc/v2_planning/code_review.txt	Thu Sep 23 13:20:08 2010 -0400
+++ b/doc/v2_planning/code_review.txt	Thu Sep 23 13:29:05 2010 -0400
@@ -12,18 +12,19 @@
 
 - make a list of point to compare tools
 - review interresting projects
-- make a decission
+- make a politic of review(who,what,what,how)
+- make a decission on projects
 
 Some system that we should check:
 ---------------------------------
 
-- `rietveld ` Made by Guido van Rossum
-- `Gerrit `
-- `track PeerReviewPlugin ` Could be integrated with the current ticket system?
+- `rietveld ` Made by Guido van Rossum, seam basic and svn only
+- `Gerrit `, git only
+- *`Review Board `_
+- *`Code Striker `, hg added? David told in May 2009 it can do it easily.
+- *`Code Review plugins in Redmine `
+- `track PeerReviewPlugin ` Could be integrated with the current ticket system?, not maintained, review code in general, not commit.
 - `feature request at assembla `
-- `Review Board `_
-- `reviewboard `
-- `Code Striker `
 - `JCR `
 
 What we could want from our code review
@@ -31,7 +32,13 @@
 
 - integrate with our ticket system?
 - Should we keep our current ticket system?
-- work with mercurial
+- work with mercurial, git?
+- check each commit of theano/pylearn
+- check experimental repository code when asked
+- how show diff? patch? syntax highlight as vimdiff?
+- If we commit something that is disabled by default and not fully working, we can say it in the commit message to have a faster review(only check that by default it is disabled). Then we should say in the commit message when it is ready for a full review.
+- Review should be done by everybody.
+- Who choose the reviewer(random, commiter)? pool of reviewers? pool level 1,2,3 where 1 is everybody with commit right. pool for specific topic(gpu, ML algo, ...)?
 Doc on code review
 ------------------
@@ -40,11 +47,35 @@
 - http://ostatic.com/blog/open-source-code-review-tools
 - http://en.wikipedia.org/wiki/Code_review
 
-Alternative to code review
---------------------------
+Type of code review
+-------------------
 
+- Formal review - Many person review together each line of the program.
 - Over-the-shoulder – One developer looks over the author's shoulder as the latter walks through the code.
 - Email pass-around – Source code management system emails code to reviewers automatically after checkin is made.
 - Pair Programming – Two authors develop code together at the same workstation, such is common in Extreme Programming.
 - Tool-assisted code review – Authors and reviewers use specialized tools designed for peer code review.
-- Test-Driven development
+- Alternative: Test-Driven development
+- Automatic review: use tool as pylint, pyflakes, pychecker. Don't check everything.
+
+We seam to do Over-the-shoulder, email and variant of pair programming from time to time. Some people read rapidly the commit of Theano and Pylearn.
+
+Reason for the code review
+--------------------------
+
+- We want at least 2 people to read all code. That mean we need a reviewer
+- This help to find better solution to problem
+- This help to train people on our tools ans framework.
+
+Check list for review
+---------------------
+
+- Is their tests and do they test all case?
+- Is their documentation in the file?
+  - Do this need doc in the html doc?
+- Is the addition well integrated into our framework
+- Is the code well placed in the right files and right place in them?
+- Try to don't duplicate code
+- Is the code clear/comprehensible
+- Are the comment describing what is being done?
+- Answer question by de commiter, this can also serve to train people