--- a/doc/v2_planning/arch_src/plugin_JB_comments_YB.txt	Thu Sep 23 13:20:08 2010 -0400
+++ b/doc/v2_planning/arch_src/plugin_JB_comments_YB.txt	Thu Sep 23 13:29:05 2010 -0400
@@ -13,6 +13,11 @@
 * much more difficult to read
 * much more difficult to debug

+JB asks: I would like to try and correct you, but I don't know where to begin --
+  - What do you think is more difficult to read [than what?] and why?
+  - What do you expect to be more difficult [than what?] to debug?
+
+
 Advantages:

 * easier to serialize (can't we serialize an ordinary Python class created by a normal user?)
@@ -21,6 +26,21 @@
    when possible, and just create another code for a new DBN variant when it can't fit?)
 * am I missing something?

+JB replies:
+  - Re serializability - I think any system that supports program pausing,
+    resuming, and dynamic editing (a.k.a. process surgery) will have the flavour
+    of my proposal.  If someone has a better idea, he should suggest it.
+
+  - Re hooks & constructors - the mechanism I propose is more flexible than hooks and constructor
+    parameters.  Hooks and constructor parameters have their place, and would be
+    used under my proposal as usual to configure the modules on which the
+    flow-graph operates.  But flow-graphs are more flexible. Flow-graphs
+    (REPEAT, CALL, etc.) that are constructed by library calls can be directly
+    modified.  You can add new hooks, for example, or add a print statement
+    between two statements (CALLs) that previously had no hook between them.
+    - the analogous thing using the real Python VM would be to dynamically
+      re-program Python's compiled bytecode, which I don't think is possible.
+
 I am not convinced that any of the stated advantages can't be achieved in more traditional ways.

 RP comment: James or anybody else correct me if I'm wrong. What I think James
@@ -55,3 +75,28 @@
 necessarily require the ability to serialize / restart at any point). About
 the ability to move / substitute things, you could probably achieve the same
 goal with proper code factorization / conventions.
+
+JB replies:
+  You are right that with sufficient discipline on everyone's part,
+  and a good design using existing Python control flow (loops and functions), it is
+  probably possible to get many of the features I'm claiming with my proposal.
+
+  But I don't think Python offers very helpful syntax or control-flow elements
+  for programming parallel, distributed computations, because the Python
+  interpreter doesn't do that.
+
+  What I'm trying to design is a mechanism that can allow us to *express the entire
+  learning algorithm* in a program.  That means
+  - including the grid-search,
+  - including the use of the cluster,
+  - including the pre-processing and post-processing.
+
+  To make that actually work, programs need to be more flexible - we need to be
+  able to pause and resume 'function calls', and to possibly restart them if we
+  find a problem (without having to restart the whole program).  We already do
+  these things in ad-hoc ways by writing scripts, generating intermediate files,
+  etc., but I think we would empower ourselves by using a tool that lets us
+  actually write down the *WHOLE* algorithm, in one place rather than as a README
+  with a list of scripts and instructions for what to do with them (especially
+  because the README often never gets written).
+
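JB's point that a flow-graph built by a library call can still be edited afterwards can be illustrated with a minimal Python sketch. Only the names CALL and REPEAT come from the discussion above; the classes, their fields, the run() interpreter and the step functions are all hypothetical, chosen for illustration, and are not the proposal's actual design. The sketch shows only the editing step (inserting a new CALL between two existing ones), not pausing, resuming or serialization.

```python
# Hypothetical sketch: CALL and REPEAT are named in the discussion, but these
# classes, their fields, and run() are assumptions made for illustration.

class CALL:
    """One step of the program: calls fn(state) when executed."""
    def __init__(self, fn):
        self.fn = fn


class REPEAT:
    """Runs a list of steps n times. The list is plain data, so it can be edited."""
    def __init__(self, n, body):
        self.n = n
        self.body = list(body)


def run(node, state):
    """Interpret a flow-graph node against a mutable state dict."""
    if isinstance(node, CALL):
        node.fn(state)
    elif isinstance(node, REPEAT):
        for _ in range(node.n):
            for step in node.body:
                run(step, state)
    else:
        raise TypeError("unknown node: %r" % (node,))


def train_step(state):
    state["updates"] += 1


def validate(state):
    state["log"].append(("validate", state["updates"]))


def report(state):
    state["log"].append(("report", state["updates"]))


def make_training_loop(n_epochs):
    """Stand-in for a library call that builds a loop with no hook between steps."""
    return REPEAT(n_epochs, [CALL(train_step), CALL(validate)])


program = make_training_loop(3)

# Edit the constructed program: insert a new step between the two existing CALLs.
program.body.insert(1, CALL(report))

state = {"updates": 0, "log": []}
run(program, state)
```

Because the program is ordinary data rather than compiled bytecode, the caller can add the report step without the library author having provided a hook at that position.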
--- a/doc/v2_planning/code_review.txt	Thu Sep 23 13:20:08 2010 -0400
+++ b/doc/v2_planning/code_review.txt	Thu Sep 23 13:29:05 2010 -0400
@@ -12,18 +12,19 @@

 - make a list of points to compare tools
 - review interesting projects
-- make a decission
+- make a review policy (who, what, how)
+- make a decision on projects

 Some systems that we should check:
 ----------------------------------

-- `rietveld <http://code.google.com/p/rietveld/>` Made by Guido van Rossum
-- `Gerrit <http://code.google.com/p/gerrit/>`
-- `track PeerReviewPlugin <http://trac-hacks.org/wiki/PeerReviewPlugin>` Could be integrated with the current ticket system?
+- `rietveld <http://code.google.com/p/rietveld/>` Made by Guido van Rossum, seems basic and svn only
+- `Gerrit <http://code.google.com/p/gerrit/>`, git only
+- *`Review Board <http://www.reviewboard.org>`_
+- *`Code Striker <http://codestriker.sourceforge.net/>`, hg support added? David said in May 2009 that it could be done easily.
+- *`Code Review plugins in Redmine <http://www.redmine.org/boards/3/topics/9627>`
+- `Trac PeerReviewPlugin <http://trac-hacks.org/wiki/PeerReviewPlugin>` Could be integrated with the current ticket system? Not maintained; reviews code in general, not individual commits.
 - `feature request at assembla <http://feedback.assembla.com/forums/5433-feature-requests/suggestions/253297-add-a-code-review-tool-e-g-reviewboard->`
-- `Review Board <http://www.reviewboard.org>`_
-- `reviewboard <http://code.google.com/p/reviewboard/>`
-- `Code Striker <http://codestriker.sourceforge.net/>`
 - `JCR <http://jcodereview.sourceforge.net/>`

 What we could want from our code review
@@ -31,7 +32,13 @@

 - integrate with our ticket system?
     - Should we keep our current ticket system?
-- work with mercurial
+- work with mercurial, git?
+- check each commit of theano/pylearn
+- check experimental repository code when asked
+- how to show diffs? patch? syntax highlighting as in vimdiff?
+- If we commit something that is disabled by default and not fully working, we can say so in the commit message to get a faster review (only checking that it is disabled by default). Then we should say in the commit message when it is ready for a full review.
+- Review should be done by everybody.
+- Who chooses the reviewer (random, committer)? A pool of reviewers? Pool levels 1, 2, 3, where 1 is everybody with commit rights? Pools for specific topics (GPU, ML algorithms, ...)?

 Doc on code review
 ------------------
@@ -40,11 +47,35 @@
 - http://ostatic.com/blog/open-source-code-review-tools
 - http://en.wikipedia.org/wiki/Code_review

-Alternative to code review
---------------------------
+Types of code review
+--------------------

+- Formal review - Several people review each line of the program together.
 - Over-the-shoulder – One developer looks over the author's shoulder as the latter walks through the code.
 - Email pass-around – Source code management system emails code to reviewers automatically after checkin is made.
 - Pair Programming – Two authors develop code together at the same workstation, such is common in Extreme Programming.
 - Tool-assisted code review – Authors and reviewers use specialized tools designed for peer code review.
-- Test-Driven development
+- Alternative: Test-Driven development
+- Automatic review: use tools such as pylint, pyflakes, pychecker. They don't check everything.
+
+We seem to do over-the-shoulder, email, and a variant of pair programming from time to time. Some people quickly read the commits to Theano and Pylearn.
+
+Reasons for code review
+-----------------------
+
+- We want at least 2 people to read all code. That means we need a reviewer.
+- This helps to find better solutions to problems.
+- This helps to train people on our tools and framework.
+
+Checklist for review
+--------------------
+
+- Are there tests, and do they test all cases?
+- Is there documentation in the file?
+    - Does this need documentation in the HTML doc?
+- Is the addition well integrated into our framework?
+- Is the code placed in the right files, and in the right place within them?
+- Try not to duplicate code.
+- Is the code clear/comprehensible?
+- Do the comments describe what is being done?
+- Answer questions from the committer; this can also serve to train people.