comparison doc/v2_planning/API_coding_style.txt @ 1174:fe6c25eb1e37

merge
author pascanur
date Fri, 17 Sep 2010 16:13:58 -0400
parents a0f178bc9052 53340a8df1fa
children 67f4edabb0cc
comparison
equal deleted inserted replaced
1173:a0f178bc9052 1174:fe6c25eb1e37
1 ========================= 1 =========================
2 Coding Style Guidelines 2 Coding Style Guidelines
3 ========================= 3 =========================
4
5 Note: until the Pylearn documentation is properly compiled, you can view
6 the HTML version of this document `here
7 <http://www.iro.umontreal.ca/~delallea/tmp/coding_style/html/API_coding_style.html>`_.
4 8
5 Main Goals 9 Main Goals
6 ========== 10 ==========
7 11
8 * Code should be compatible with Python 2.4 and above (using 2to3 for 12 * Code should be compatible with Python 2.4 and above (using 2to3 for
56 60
57 Which means it has more than one line. 61 Which means it has more than one line.
58 62
59 """ 63 """
60 64
65 * Standard library imports can (and should) be on the same line, to avoid
66 wasting space on straighforward imports:
67
68 .. code-block:: python
69
70 # Good.
71 import os, sys, time
72 # Good when it does not fit on a single line.
73 import std_lib_module_1, std_lib_module_2, std_lib_module_3
74 import std_lib_module_4, std_lib_module_5, std_lib_module_6
75 # Bad.
76 import os
77 import sys
78 import time
79
80 * Importing class / functions from a module is allowed when these are
81 used multiple times, and no ambiguity is possible.
82
83 .. code-block:: python
84
85 # Good when Bar and Blah are used many times.
86 from foo import Bar, Blah
87 do_something_with(Bar(), Blah(), Bar(), Blah(), Bar(), Blah())
88 # Good in most situations.
89 import foo
90 do_something_with(foo.Bar(), foo.Blah())
91 # Bad.
92 from foo import *
93 from numpy import any # Potential ambiguity with __builtin__.any
94
61 Excerpts 95 Excerpts
62 ~~~~~~~~ 96 ~~~~~~~~
63 97
64 We emphasize here a few important topics that are found in the official 98 We emphasize here a few important topics that are found in the official
65 guidelines: 99 guidelines:
66 100
101 * Only use ASCII characters in code files.
102
103 * Code indent must be done with four blank characters (no tabs).
104
105 * Limit lines to 79 characters.
106
107 * Naming conventions: ``ClassName``, ``TOP_LEVEL_CONSTANT``,
108 ``everything_else``.
109
110 * Comments should start with a capital letter (unless the first word is a
111 code identifier) and end with a period (short inline comments may skip
112 the period at the end).
113
114 * Imports should be listed in alphabetical order. It makes it easier to
115 verify that something is imported, and avoids duplicated imports.
116
117 * Use absolute imports only. This is compatible across a wider range of
118 Python versions, and avoids confusion about what is being
119 imported.
120
121 * Avoid renaming imported modules. This makes code more difficult to
122 re-use, and is not grep-friendly.
123
124 .. code-block:: python
125
126 # Good.
127 from theano import tensor
128 # Bad.
129 from theano import tensor as T
130
67 * Avoid using lists if all you care about is iterating on something. Using 131 * Avoid using lists if all you care about is iterating on something. Using
68 lists: 132 lists:
69 - uses more memory (and possibly more CPU if the code may break out of 133
70 the iteration), 134 - uses more memory (and possibly more CPU if the code may break out of
71 - can lead to ugly code when converted to Python 3 with 2to3, 135 the iteration),
72 - can have a different behavior if evaluating elements in the list has 136 - can lead to ugly code when converted to Python 3 with 2to3,
73 side effects (if you want these side effects, make it explicit by 137 - can have a different behavior if evaluating elements in the list has
74 assigning the list to some variable before iterating on it). 138 side effects (if you want these side effects, make it explicit by
139 assigning the list to some variable before iterating on it).
75 140
76 +------------------------+------------------------+ 141 +------------------------+------------------------+
77 | Iterative version | List version | 142 | Iterative version | List version |
78 +========================+========================+ 143 +========================+========================+
79 | .. code-block:: python | .. code-block:: python | 144 | .. code-block:: python | .. code-block:: python |
99 164
100 # Good. 165 # Good.
101 for f_x in imap(f, x): 166 for f_x in imap(f, x):
102 ... 167 ...
103 all_f_x = map(f, x) 168 all_f_x = map(f, x)
104 map(f, x) 169 map(f, x) # f has some side effect.
105 # Bad. 170 # Bad.
106 for element in map(f, x): 171 for element in map(f, x):
107 ... 172 ...
108 imap(f, x) 173 imap(f, x)
109 174
131 has_substring = substring in my_string 196 has_substring = substring in my_string
132 # Bad. 197 # Bad.
133 has_key = my_dict.has_key(key) 198 has_key = my_dict.has_key(key)
134 has_substring = my_string.find(substring) >= 0 199 has_substring = my_string.find(substring) >= 0
135 200
201 * Do not use mutable arguments as default values. Instead, use a helper
202 function (conditional expressions are forbidden at this point, see
203 below).
204
205 .. code-block:: python
206
207 # Good.
208 def f(array=None):
209 array = pylearn.if_none(array, [])
210 ...
211 # Bad.
212 def f(array=[]): # Dangerous if `array` is modified down the road.
213 ...
214
215 * Use a leading underscore '_' in names of internal attributes / methods,
216 but avoid the double underscore '__' unless you know what you are
217 doing.
218
136 219
137 Additional Recommendations 220 Additional Recommendations
138 -------------------------- 221 --------------------------
139 222
140 Things you should do even if they are not listed in official guidelines: 223 Things you should do even if they are not listed in official guidelines:
224
225 * All Python code files should start like this:
226
227 .. code-block:: python
228
229 """Module docstring as the first line, as usual."""
230
231 __authors__ = "Olivier Delalleau, Frederic Bastien, David Warde-Farley"
232 __copyright__ = "(c) 2010, Universite de Montreal"
233 __license__ = "3-clause BSD License"
234 __contact__ = "Name Of Current Guardian of this file <email@address>"
235
236 * Use ``//`` for integer division and ``/ float(...)`` if you want the
237 floating point operation (for readability and compatibility across all
238 versions of Python).
239
240 .. code-block:: python
241
242 # Good.
243 n_samples_per_split = n_samples // n_splits
244 mean_x = sum(x) / float(len(x))
245 # Bad.
246 n_samples_per_split = n_samples / n_splits
247 mean_x = sum(x) / len(x)
248
249 * Always raise an exception with ``raise MyException(args)`` where ``MyException``
250 inherits from ``Exception``. This is required for compatibility across
251 all versions of Python.
252
253 .. code-block:: python
254
255 # Good.
256 raise NotImplementedError('The Pylearn team is too lazy.')
257 # Bad.
258 raise NotImplementedError, 'The Pylearn team is too lazy.'
259 raise 'The Pylearn team is too lazy to implement this.'
260
261 * Use either ``try ... except`` or ``try ... finally``, but do not mix
262 ``except`` with ``finally`` (which is not supported in Python 2.4).
263 You can however embed one into the other to mimic the ``try ... except ...
264 finally`` behavior.
265
266 .. code-block:: python
267
268 # Good.
269 try:
270 try:
271 something_that_may_fail()
272 except SomeError:
273 do_something_if_it_failed()
274 finally:
275 always_do_this_regardless_of_what_happened()
276 # Bad.
277 try:
278 something_that_may_fail()
279 except SomeError:
280 do_something_if_it_failed()
281 finally:
282 always_do_this_regardless_of_what_happened()
283
284 * No conditional expression (not supported in Python 2.4). These are
285 expressions of the form ``x = y if condition else z``.
286
287 * Do not use the ``all`` and ``any`` builtin functions (they are not supported
288 in Python 2.4). Instead, import them from ``theano.gof.python25`` (or
289 use ``numpy.all`` / ``numpy.any`` for array data).
290
291 * Do not use the ``hashlib`` module (not supported in Python 2.4). We will
292 probably provide a wrapper around it to be compatible with all Python
293 versions.
294
295 * Use ``numpy.inf`` and ``numpy.nan`` rather than
296 ``float('inf')`` / ``float('nan')`` (should be slightly more efficient even
297 if efficiency is typically not an issue here, the main goal being code
298 consistency). Also, always use ``numpy.isinf`` / ``numpy.isnan`` to
299 test infinite / NaN values. This is important because ``numpy.nan !=
300 float('nan')``.
141 301
142 * Avoid backslashes whenever possible. They make it more 302 * Avoid backslashes whenever possible. They make it more
143 difficult to edit code, and they are ugly (as well as potentially 303 difficult to edit code, and they are ugly (as well as potentially
144 dangerous if there are trailing white spaces). 304 dangerous if there are trailing white spaces).
145 305
193 my_bar, 353 my_bar,
194 my_love, 354 my_love,
195 my_everything]: 355 my_everything]:
196 ... 356 ...
197 357
358 * Use the ``key`` argument instead of ``cmp`` when sorting (for Python 3
359 compatibility).
360
361 .. code-block:: python
362
363 # Good.
364 my_list.sort(key=abs)
365 # Bad.
366 my_list.sort(cmp=lambda x, y: cmp(abs(x), abs(y)))
367
368 * Whenever you read / write binary files, specify it in the mode ('rb' for
369 reading, 'wb' for writing). This is important for cross-platform and
370 Python 3 compatibility (e.g. when pickling / unpickling objects).
371
372 .. code-block:: python
373
374 # Good.
375 cPickle.dump(obj, open('my_obj.pkl', 'wb', protocol=-1))
376 # Bad.
377 cPickle.dump(obj, open('my_obj.pkl', 'w', protocol=-1))
378
379 * Avoid tuple parameter unpacking as it can lead to very ugly code when
380 converting to Python 3.
381
382 .. code-block:: python
383
384 # Good.
385 def f(x, y_z):
386 y, z = y_z
387 ...
388 # Bad.
389 def f(x, (y, z)):
390 ...
391
392 * Only use ``cPickle``, not ``pickle`` (except for debugging purpose since
393 error messages from ``pickle`` are sometimes easier to understand).
394
395 * A script's only top-level code should be something like:
396
397 .. code-block:: python
398
399 if __name__ == '__main__':
400 sys.exit(main())
401
198 402
199 The ``logging`` Module vs. the ``warning`` Module 403 The ``logging`` Module vs. the ``warning`` Module
200 ================================================= 404 =================================================
201 405
202 The ``logging`` Module 406 The ``logging`` Module
244 referring to Pylearn internals. 448 referring to Pylearn internals.
245 449
246 Code Sample 450 Code Sample
247 =========== 451 ===========
248 452
249 The following code sample illustrates many of the coding guidelines one should 453 The following code sample illustrates some of the coding guidelines one should
250 follow in Pylearn. 454 follow in Pylearn. This is still a work-in-progress.
251 455
252 .. code-block:: python 456 .. code-block:: python
253 457
458 #! /usr/env/bin python
459
460 """Sample code. There may still be mistakes / missing elements."""
461
462 __authors__ = "Olivier Delalleau"
463 __copyright__ = "(c) 2010, Universite de Montreal"
464 __license__ = "3-clause BSD License"
465 __contact__ = "Olivier Delalleau <delallea@iro>"
466
467 # Standard library imports are on a single line.
254 import os, sys, time 468 import os, sys, time
255 469
470 # Third-party imports come after standard library imports, and there is
471 # only one import per line. Imports are sorted lexicographically.
472 import numpy
473 import scipy
474 import theano
475 # Put 'from' imports below.
476 from numpy import argmax
477 from theano import tensor
478
479 # Application-specific imports come last.
480 from pylearn import dataset
481 from pylearn.optimization import minimize
482
483 def print_files_in(directory):
484 """Print the first line of each file in given directory."""
485 # TODO To be continued...
486
487 def main():
488 if len(sys.argv) != 2:
489 # Note: conventions on how to display script documentation and
490 # parse arguments are still to-be-determined.
491 print("""\
492 Usage: %s <directory>
493 Print first line of each file in given directory (in alphabetic order)."""
494 % os.path.basename(sys.argv[0]))
495 return 1
496 print_files_in(sys.argv[1])
497 return 0
498
499 # Top-level executable code should be minimal.
500 if __name__ == '__main__':
501 sys.exit(main())
502
503
504 Automatic Code Verification
505 ===========================
506
507 Tools will be available to make it easier to automatically ensure that code
508 committed to Pylearn complies to above specifications. This work is not
509 finalized yet, but David started a `Wiki page`_ with helpful configuration
510 tips for Vim.
511
512 .. _Wiki page: http://www.iro.umontreal.ca/~lisa/twiki/bin/view.cgi/Divers/VimPythonRecommendations
513
514 TODO
515 ====
516
517 Things still missing from this document, being discussed in coding_style.txt:
518 - Proper style for C code and Mercurial commits
519 - Enforcing 100% test coverage of the code base
520 - Providing ways to add type checking for function arguments
521 - Conventions for script usage documentation and argument parsing
522