diff doc/v2_planning/coding_style.txt @ 1103:56c5f0990869

coding_style: More work on some guidelines, also put some points to debate in a specific section
author Olivier Delalleau <delallea@iro>
date Mon, 13 Sep 2010 16:50:24 -0400
parents d422f726c156
children 60ef81fe1825
--- a/doc/v2_planning/coding_style.txt	Mon Sep 13 15:03:40 2010 -0400
+++ b/doc/v2_planning/coding_style.txt	Mon Sep 13 16:50:24 2010 -0400
@@ -8,7 +8,77 @@
 - David
 - Olivier D [leader]
 
+Open for public debate
+----------------------
 
+   * Use imports for packages and modules only. I.e. avoid
+        from foo import *
+        from foo import Bar
+     OD: Overall I agree with this. However, we probably want to allow some
+        exceptions, like:
+            from itertools import imap, izip
+        Also, some people may want to have shortcuts like
+            from theano import tensor as T
+        but I would prefer to forbid this. It is handy when trying stuff in
+        the interactive interpreter, but in real code it can easily get messy
+        when you want to copy / paste different pieces of code and they use
+        different conventions. Typing tensor.* is a bit longer, but a lot more
+        portable.
+
+   * Imports should usually be on separate lines.
+     OD: I would add an exception, saying it is ok to group multiple imports
+        from the standard library on a single line, e.g.
+            import os, sys, time
+        I just don't see much benefit in putting them on separate lines (for
+        third-party imports I agree it is best to keep them separate, as it
+        makes dependencies clearer, and diffs look better when someone adds /
+        removes an import).  Does anyone see a good reason to keep standard
+        library imports on different lines?
+
+    * The BDFL recommends inserting a blank line between the
+      last paragraph in a multi-line docstring and its closing quotes, placing
+      the closing quotes on a line by themselves. This way, Emacs'
+      fill-paragraph command can be used on it.
+      OD: I think it is ugly and I have not seen it used much. Does any Emacs
+        user believe it is a must?
+
+    * Avoid contractions in code comments (particularly in
+      documentation): "We do not add blue to red because it does not look good"
+      rather than "We don't add blue to red because it doesn't look good".
+      OD: I mostly find it to be cleaner (I am also used to it from writing
+          scientific articles).
+
+   * Imperative vs. third-person comments.
+        # Return the sum of elements in x.  <-- imperative
+        # Returns the sum of elements in x. <-- third-person
+     OD: I am used to the imperative form and like it better only because it
+         typically saves one letter (the 's') and is easier to conjugate.
+
+    * OD: I like always doing the following when subclassing
+      a class A:
+        class B(A):
+            def __init__(self, b_arg_1, b_arg_2, **kw):
+                super(B, self).__init__(**kw)
+                ...
+      The point here is that the constructor always allows for extra keyword
+      arguments (except for the class at the very top of the hierarchy), which
+      are automatically passed to the parent class.
+      Pros:
+        - You do not need to repeat the parent class arguments whenever you
+          write a new subclass.
+        - Whenever you add an argument to the parent class, all child classes
+          can benefit from it without modifying their code.
+      Cons:
+        - One needs to look at the parent classes to see what these arguments
+          are.
+        - You cannot use a **kw argument in your constructor for your own
+          selfish purpose.
+        - I have no clue whether one could do this with multiple inheritance.
+        - More?
+      Question: Should we encourage this in Pylearn?
+
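On the multiple-inheritance question raised in the last point above: a minimal
sketch, assuming every class in the hierarchy follows the same **kw convention
and that all branches share a common root which accepts no extra keywords. The
names Base, A, B, C and their arguments are hypothetical and only serve the
illustration; this is not a tested Pylearn recipe.

```python
class Base(object):
    """Hypothetical top of the hierarchy: accepts no extra keywords."""

    def __init__(self):
        super(Base, self).__init__()


class A(Base):
    def __init__(self, a_arg=0, **kw):
        # Keep our own argument, forward the rest along the MRO.
        super(A, self).__init__(**kw)
        self.a_arg = a_arg


class B(Base):
    def __init__(self, b_arg=0, **kw):
        super(B, self).__init__(**kw)
        self.b_arg = b_arg


class C(A, B):
    def __init__(self, c_arg=0, **kw):
        # The MRO is C -> A -> B -> Base -> object, so this call reaches
        # A first, then B, each one consuming its own keyword.
        super(C, self).__init__(**kw)
        self.c_arg = c_arg


obj = C(a_arg=1, b_arg=2, c_arg=3)
```

Under these assumptions each constructor picks out its own keyword and passes
the remainder on, so the pattern appears to extend to multiple inheritance. An
unknown keyword travels all the way to Base and fails there with a TypeError,
which illustrates the "need to look at the parent classes" drawback listed in
the cons.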
+Note about warnings
+-------------------
 
 Fred: This is a refactored version of James' email about what we should put
 in the messages that we send to the user:
@@ -19,21 +89,24 @@
 Existing Python coding style specifications and guidelines
 ----------------------------------------------------------
 
-    * http://www.python.org/dev/peps/pep-0008/ Style Guide for Python Code
-    * http://www.python.org/dev/peps/pep-0257/ Docstring Conventions 
-    * http://google-styleguide.googlecode.com/svn/trunk/pyguide.html Google Python Style Guide
-    * http://www.voidspace.org.uk/python/articles/python_style_guide.shtml
-    * http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html
-    * http://www.cs.caltech.edu/courses/cs11/material/python/misc/python_style_guide.html
-    * http://barry.warsaw.us/software/STYLEGUIDE.txt
-    * http://self.maluke.com/style
-    * http://chandlerproject.org/Projects/ChandlerCodingStyleGuidelines
-    * http://lists.osafoundation.org/pipermail/dev/2003-March/000479.html
-    * http://learnpython.pbworks.com/PythonTricks
-    * http://eikke.com/how-not-to-write-python-code/
-    * http://jaynes.colorado.edu/PythonGuidelines.html
-    * http://docs.djangoproject.com/en/dev/internals/contributing/#coding-style
-    * http://projects.scipy.org/numpy/wiki/CodingStyleGuidelines 
+  * Must-read
+    * Official Python coding style guide: http://www.python.org/dev/peps/pep-0008
+    * Official docstring conventions: http://www.python.org/dev/peps/pep-0257
+    * Google Python Style Guide: http://google-styleguide.googlecode.com/svn/trunk/pyguide.html
+  * Interesting
+    * Code Like a Pythonista: http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html
+  * Can skip
+    * Python style guide for a university class: http://www.cs.caltech.edu/courses/cs11/material/python/misc/python_style_guide.html
+    * Mailman coding style: http://barry.warsaw.us/software/STYLEGUIDE.txt
+    * Some company coding style: http://self.maluke.com/style
+    * Chandler coding style: http://chandlerproject.org/Projects/ChandlerCodingStyleGuidelines
+    * Outdated recommendations: http://lists.osafoundation.org/pipermail/dev/2003-March/000479.html
+    * Mostly beginners' tips: http://learnpython.pbworks.com/PythonTricks
+    * More beginners' tips: http://eikke.com/how-not-to-write-python-code/
+    * Cogent coding guidelines: http://jaynes.colorado.edu/PythonGuidelines.html
+    * Django coding guidelines: http://docs.djangoproject.com/en/dev/internals/contributing/#coding-style
+    * Numpy documentation style guidelines: http://projects.scipy.org/numpy/wiki/CodingStyleGuidelines
+    * Some random guy's guidelines (nothing special): http://www.voidspace.org.uk/python/articles/python_style_guide.shtml
 
 We will probably want to take PEP-8 as a starting point, and read what other
 people think about it / how other coding guidelines differ from it.
@@ -61,27 +134,7 @@
     or less been wiped out by HTML's convention of ignoring extra 
     whitespace: see http://en.wikipedia.org/wiki/Sentence_spacing for
     more detail. I think it's okay to drop this convention in source code.)
-
-   * Imports should usually be on separate lines
-    --> Can be a lot of lines wasted for no obvious benefit. I think this is
-        mostly useful when you import different modules from different places,
-        but I would say that for instance for standard modules it would be
-        better to import them all on a single line (doing multiple lines only
-        if there are too many of them), e.g. prefer:
-            import os, sys, time
-        to
-            import os
-            import sys
-            import time
-        However, I agree about separating imports between standard lib / 3rd
-        party, e.g. prefer:
-            import os, sys, time
-            import numpy, scipy
-        to
-            import numpy, os, scipy, sys, time
-        (Personal note: preferably order imports by alphabetical order, makes
-         it easier to quickly see if a specific module is already imported,
-         and avoids duplicated imports)
+    OD: Cool, thanks, I guess we can drop it then.
 
     * Missing in PEP 8:
         - How to indent multi-line statements? E.g. do we want
@@ -101,12 +154,6 @@
           be to go with 2 when it can fit on two lines, and 3 otherwise. Same
           with lists.
 
-    * From PEP 257: The BDFL [3] recommends inserting a blank line between the
-      last paragraph in a multi-line docstring and its closing quotes, placing
-      the closing quotes on a line by themselves. This way, Emacs'
-      fill-paragraph command can be used on it.
-     --> I have nothing against Emacs, but this is ugly!
-
 Documentation
 -------------
 
@@ -136,6 +183,8 @@
 Use RST with Sphinx.
 Task: Provide specific examples on how to document a class, method, and some
 specific classes like Op (DE). Modify the theano documentation to include that.
+OD: May want to check out
+    http://projects.scipy.org/numpy/wiki/CodingStyleGuidelines
 
    * Python versions to be supported
 Support 2.4 (because some of the clusters are still running 2.4) and write
@@ -181,8 +230,8 @@
 
 Have sample code that showcases everything one should comply with.
 
-Some coding guidlines (work-in-progress from OD)
-------------------------------------------------
+Some coding guidelines (work-in-progress from OD)
+-------------------------------------------------
 
    * Avoid using lists if all you care about is iterating on something. Using
      lists:
@@ -211,39 +260,6 @@
     key in my_dict              my_dict.has_key(key)
     sub_string in my_string     my_string.find(sub_string) >= 0
 
-    * (Point to debate) Avoid contractions in code comments (particularly in
-      documentation): "We do not add blue to red because it does not look
-      good" rather than "We don't add blue to red because it doesn't look
-      good". I mostly find it to be cleaner (been used to it while writing
-      scientific articles too).
-
-   * (Point to debate) Imperative vs. third-person comments. I am used to the
-     imperative form and like it better only because it typically saves one
-     letter (the 's'): "Return the sum of elements in x" rather than
-     "Returns the sum of elements in x".
-
-    * (Point to debate) I like always doing the following when subclassing
-      a class A:
-        class B(A):
-            def __init__(self, b_arg_1, b_arg_2, **kw):
-                super(B, self).__init__(**kw)
-                ...
-      The point here is that the constructor always allow for extra keyword
-      arguments (except for the class at the very top of the hierarchy), which
-      are automatically passed to the parent class.
-      Pros:
-        - You do not need to repeat the parent class arguments whenever you
-          write a new subclass.
-        - Whenever you add an argument to the parent class, all child classes
-          can benefit from it without modifying their code.
-      Cons:
-        - One needs to look at the parent classes to see what these arguments
-          are.
-        - You cannot use a **kw argument in your constructor for your own
-          selfish purpose.
-        - I have no clue whether one could do this with multiple inheritance.
-        - More?
-      Question: Should we encourage this in Pylearn?
 
    * Generally prefer list comprehensions to map / filter, as the former are
      easier to read.
@@ -272,6 +288,12 @@
 
     * Code indent must be done with four blank characters (not with tabs).
 
+    * Limit lines to 79 characters.
+
+    * Comments should start with a capital letter (unless the first word is a
+      code identifier) and end with a period (very short inline comments may
+      ignore this rule).
+
     * Whenever you read / write binary files, specify it in the mode ('rb' for
       reading, 'wb' for writing). This is important for cross-platform and
       Python 3 compatibility (e.g. when pickling / unpickling objects).
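As an illustration of the binary-mode guideline above, here is a minimal
sketch of a pickle round trip. The dictionary contents and the temporary file
are made up for the example, and try / finally is used instead of a `with`
statement on the assumption that Python 2.4 must still be supported.

```python
import os
import pickle
import tempfile

# Hypothetical object to store; any picklable object would do.
data = {'weights': [0.1, 0.2, 0.3], 'epoch': 5}

fd, path = tempfile.mkstemp(suffix='.pkl')
os.close(fd)
try:
    # 'wb': write bytes, with no newline translation on any platform.
    f = open(path, 'wb')
    try:
        pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)
    finally:
        f.close()

    # 'rb': read the same bytes back unchanged.
    f = open(path, 'rb')
    try:
        restored = pickle.load(f)
    finally:
        f.close()
finally:
    os.remove(path)
```

Opening the file in text mode instead can corrupt binary pickles on Windows,
and Python 3 refuses to pickle to a text-mode file at all.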
@@ -290,9 +312,22 @@
         raise MyException(args)
       where MyException inherits from Exception.
 
+    * Imports should be listed in alphabetical order. It makes it easier to
+      verify that something is imported, and avoids duplicated imports.
+
+    * Use a leading underscore '_' for internal attributes / methods,
+      but avoid the double underscore '__' unless you know what you are
+      doing.
+
+    * A script's only top-level code should be something like:
+        if __name__ == '__main__':
+            sys.exit(main())
+
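The script guideline above can be sketched as follows. This is only an
illustration: the greeting behaviour, the optional `argv` parameter and the
exit status 2 for a usage error are assumptions made for the example, not
something this document prescribes.

```python
import sys


def main(argv=None):
    """Greet the name given on the command line and return an exit status."""
    if argv is None:
        argv = sys.argv[1:]
    if len(argv) > 1:
        sys.stderr.write('usage: greet.py [NAME]\n')
        return 2  # Non-zero status signals an error to the shell.
    if argv:
        name = argv[0]
    else:
        name = 'world'
    sys.stdout.write('Hello, %s\n' % name)
    return 0


if __name__ == '__main__':
    sys.exit(main())
```

Keeping all the logic in main() means the module can be imported (for
instance by tests, which can call main(['Alice']) directly) without running
anything, and the return value becomes the exit status of the process.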
 Mercurial commits
 -----------------
 
    * How to write good commit messages?
+    OD: Check Django's guidelines (link above)
    * Standardize the merge commit text (what is the message from fetch?)
 
+