Mercurial > pylearn
comparison doc/v2_planning/API_coding_style.txt @ 1174:fe6c25eb1e37
merge
author | pascanur |
---|---|
date | Fri, 17 Sep 2010 16:13:58 -0400 |
parents | a0f178bc9052 53340a8df1fa |
children | 67f4edabb0cc |
comparison
equal
deleted
inserted
replaced
1173:a0f178bc9052 | 1174:fe6c25eb1e37 |
---|---|
1 ========================= | 1 ========================= |
2 Coding Style Guidelines | 2 Coding Style Guidelines |
3 ========================= | 3 ========================= |
4 | |
5 Note: until the Pylearn documentation is properly compiled, you can view | |
6 the HTML version of this document `here | |
7 <http://www.iro.umontreal.ca/~delallea/tmp/coding_style/html/API_coding_style.html>`_. | |
4 | 8 |
5 Main Goals | 9 Main Goals |
6 ========== | 10 ========== |
7 | 11 |
8 * Code should be compatible with Python 2.4 and above (using 2to3 for | 12 * Code should be compatible with Python 2.4 and above (using 2to3 for |
56 | 60 |
57 Which means it has more than one line. | 61 Which means it has more than one line. |
58 | 62 |
59 """ | 63 """ |
60 | 64 |
65 * Standard library imports can (and should) be on the same line, to avoid | |
66 wasting space on straighforward imports: | |
67 | |
68 .. code-block:: python | |
69 | |
70 # Good. | |
71 import os, sys, time | |
72 # Good when it does not fit on a single line. | |
73 import std_lib_module_1, std_lib_module_2, std_lib_module_3 | |
74 import std_lib_module_4, std_lib_module_5, std_lib_module_6 | |
75 # Bad. | |
76 import os | |
77 import sys | |
78 import time | |
79 | |
80 * Importing class / functions from a module is allowed when these are | |
81 used multiple times, and no ambiguity is possible. | |
82 | |
83 .. code-block:: python | |
84 | |
85 # Good when Bar and Blah are used many times. | |
86 from foo import Bar, Blah | |
87 do_something_with(Bar(), Blah(), Bar(), Blah(), Bar(), Blah()) | |
88 # Good in most situations. | |
89 import foo | |
90 do_something_with(foo.Bar(), foo.Blah()) | |
91 # Bad. | |
92 from foo import * | |
93 from numpy import any # Potential ambiguity with __builtin__.any | |
94 | |
61 Excerpts | 95 Excerpts |
62 ~~~~~~~~ | 96 ~~~~~~~~ |
63 | 97 |
64 We emphasize here a few important topics that are found in the official | 98 We emphasize here a few important topics that are found in the official |
65 guidelines: | 99 guidelines: |
66 | 100 |
101 * Only use ASCII characters in code files. | |
102 | |
103 * Code indent must be done with four blank characters (no tabs). | |
104 | |
105 * Limit lines to 79 characters. | |
106 | |
107 * Naming conventions: ``ClassName``, ``TOP_LEVEL_CONSTANT``, | |
108 ``everything_else``. | |
109 | |
110 * Comments should start with a capital letter (unless the first word is a | |
111 code identifier) and end with a period (short inline comments may skip | |
112 the period at the end). | |
113 | |
114 * Imports should be listed in alphabetical order. It makes it easier to | |
115 verify that something is imported, and avoids duplicated imports. | |
116 | |
117 * Use absolute imports only. This is compatible across a wider range of | |
118 Python versions, and avoids confusion about what is being | |
119 imported. | |
120 | |
121 * Avoid renaming imported modules. This makes code more difficult to | |
122 re-use, and is not grep-friendly. | |
123 | |
124 .. code-block:: python | |
125 | |
126 # Good. | |
127 from theano import tensor | |
128 # Bad. | |
129 from theano import tensor as T | |
130 | |
67 * Avoid using lists if all you care about is iterating on something. Using | 131 * Avoid using lists if all you care about is iterating on something. Using |
68 lists: | 132 lists: |
69 - uses more memory (and possibly more CPU if the code may break out of | 133 |
70 the iteration), | 134 - uses more memory (and possibly more CPU if the code may break out of |
71 - can lead to ugly code when converted to Python 3 with 2to3, | 135 the iteration), |
72 - can have a different behavior if evaluating elements in the list has | 136 - can lead to ugly code when converted to Python 3 with 2to3, |
73 side effects (if you want these side effects, make it explicit by | 137 - can have a different behavior if evaluating elements in the list has |
74 assigning the list to some variable before iterating on it). | 138 side effects (if you want these side effects, make it explicit by |
139 assigning the list to some variable before iterating on it). | |
75 | 140 |
76 +------------------------+------------------------+ | 141 +------------------------+------------------------+ |
77 | Iterative version | List version | | 142 | Iterative version | List version | |
78 +========================+========================+ | 143 +========================+========================+ |
79 | .. code-block:: python | .. code-block:: python | | 144 | .. code-block:: python | .. code-block:: python | |
99 | 164 |
100 # Good. | 165 # Good. |
101 for f_x in imap(f, x): | 166 for f_x in imap(f, x): |
102 ... | 167 ... |
103 all_f_x = map(f, x) | 168 all_f_x = map(f, x) |
104 map(f, x) | 169 map(f, x) # f has some side effect. |
105 # Bad. | 170 # Bad. |
106 for element in map(f, x): | 171 for element in map(f, x): |
107 ... | 172 ... |
108 imap(f, x) | 173 imap(f, x) |
109 | 174 |
131 has_substring = substring in my_string | 196 has_substring = substring in my_string |
132 # Bad. | 197 # Bad. |
133 has_key = my_dict.has_key(key) | 198 has_key = my_dict.has_key(key) |
134 has_substring = my_string.find(substring) >= 0 | 199 has_substring = my_string.find(substring) >= 0 |
135 | 200 |
201 * Do not use mutable arguments as default values. Instead, use a helper | |
202 function (conditional expressions are forbidden at this point, see | |
203 below). | |
204 | |
205 .. code-block:: python | |
206 | |
207 # Good. | |
208 def f(array=None): | |
209 array = pylearn.if_none(array, []) | |
210 ... | |
211 # Bad. | |
212 def f(array=[]): # Dangerous if `array` is modified down the road. | |
213 ... | |
214 | |
215 * Use a leading underscore '_' in names of internal attributes / methods, | |
216 but avoid the double underscore '__' unless you know what you are | |
217 doing. | |
218 | |
136 | 219 |
137 Additional Recommendations | 220 Additional Recommendations |
138 -------------------------- | 221 -------------------------- |
139 | 222 |
140 Things you should do even if they are not listed in official guidelines: | 223 Things you should do even if they are not listed in official guidelines: |
224 | |
225 * All Python code files should start like this: | |
226 | |
227 .. code-block:: python | |
228 | |
229 """Module docstring as the first line, as usual.""" | |
230 | |
231 __authors__ = "Olivier Delalleau, Frederic Bastien, David Warde-Farley" | |
232 __copyright__ = "(c) 2010, Universite de Montreal" | |
233 __license__ = "3-clause BSD License" | |
234 __contact__ = "Name Of Current Guardian of this file <email@address>" | |
235 | |
236 * Use ``//`` for integer division and ``/ float(...)`` if you want the | |
237 floating point operation (for readability and compatibility across all | |
238 versions of Python). | |
239 | |
240 .. code-block:: python | |
241 | |
242 # Good. | |
243 n_samples_per_split = n_samples // n_splits | |
244 mean_x = sum(x) / float(len(x)) | |
245 # Bad. | |
246 n_samples_per_split = n_samples / n_splits | |
247 mean_x = sum(x) / len(x) | |
248 | |
249 * Always raise an exception with ``raise MyException(args)`` where ``MyException`` | |
250 inherits from ``Exception``. This is required for compatibility across | |
251 all versions of Python. | |
252 | |
253 .. code-block:: python | |
254 | |
255 # Good. | |
256 raise NotImplementedError('The Pylearn team is too lazy.') | |
257 # Bad. | |
258 raise NotImplementedError, 'The Pylearn team is too lazy.' | |
259 raise 'The Pylearn team is too lazy to implement this.' | |
260 | |
261 * Use either ``try ... except`` or ``try ... finally``, but do not mix | |
262 ``except`` with ``finally`` (which is not supported in Python 2.4). | |
263 You can however embed one into the other to mimic the ``try ... except ... | |
264 finally`` behavior. | |
265 | |
266 .. code-block:: python | |
267 | |
268 # Good. | |
269 try: | |
270 try: | |
271 something_that_may_fail() | |
272 except SomeError: | |
273 do_something_if_it_failed() | |
274 finally: | |
275 always_do_this_regardless_of_what_happened() | |
276 # Bad. | |
277 try: | |
278 something_that_may_fail() | |
279 except SomeError: | |
280 do_something_if_it_failed() | |
281 finally: | |
282 always_do_this_regardless_of_what_happened() | |
283 | |
284 * No conditional expression (not supported in Python 2.4). These are | |
285 expressions of the form ``x = y if condition else z``. | |
286 | |
287 * Do not use the ``all`` and ``any`` builtin functions (they are not supported | |
288 in Python 2.4). Instead, import them from ``theano.gof.python25`` (or | |
289 use ``numpy.all`` / ``numpy.any`` for array data). | |
290 | |
291 * Do not use the ``hashlib`` module (not supported in Python 2.4). We will | |
292 probably provide a wrapper around it to be compatible with all Python | |
293 versions. | |
294 | |
295 * Use ``numpy.inf`` and ``numpy.nan`` rather than | |
296 ``float('inf')`` / ``float('nan')`` (should be slightly more efficient even | |
297 if efficiency is typically not an issue here, the main goal being code | |
298 consistency). Also, always use ``numpy.isinf`` / ``numpy.isnan`` to | |
299 test infinite / NaN values. This is important because ``numpy.nan != | |
300 float('nan')``. | |
141 | 301 |
142 * Avoid backslashes whenever possible. They make it more | 302 * Avoid backslashes whenever possible. They make it more |
143 difficult to edit code, and they are ugly (as well as potentially | 303 difficult to edit code, and they are ugly (as well as potentially |
144 dangerous if there are trailing white spaces). | 304 dangerous if there are trailing white spaces). |
145 | 305 |
193 my_bar, | 353 my_bar, |
194 my_love, | 354 my_love, |
195 my_everything]: | 355 my_everything]: |
196 ... | 356 ... |
197 | 357 |
358 * Use the ``key`` argument instead of ``cmp`` when sorting (for Python 3 | |
359 compatibility). | |
360 | |
361 .. code-block:: python | |
362 | |
363 # Good. | |
364 my_list.sort(key=abs) | |
365 # Bad. | |
366 my_list.sort(cmp=lambda x, y: cmp(abs(x), abs(y))) | |
367 | |
368 * Whenever you read / write binary files, specify it in the mode ('rb' for | |
369 reading, 'wb' for writing). This is important for cross-platform and | |
370 Python 3 compatibility (e.g. when pickling / unpickling objects). | |
371 | |
372 .. code-block:: python | |
373 | |
374 # Good. | |
375 cPickle.dump(obj, open('my_obj.pkl', 'wb', protocol=-1)) | |
376 # Bad. | |
377 cPickle.dump(obj, open('my_obj.pkl', 'w', protocol=-1)) | |
378 | |
379 * Avoid tuple parameter unpacking as it can lead to very ugly code when | |
380 converting to Python 3. | |
381 | |
382 .. code-block:: python | |
383 | |
384 # Good. | |
385 def f(x, y_z): | |
386 y, z = y_z | |
387 ... | |
388 # Bad. | |
389 def f(x, (y, z)): | |
390 ... | |
391 | |
392 * Only use ``cPickle``, not ``pickle`` (except for debugging purpose since | |
393 error messages from ``pickle`` are sometimes easier to understand). | |
394 | |
395 * A script's only top-level code should be something like: | |
396 | |
397 .. code-block:: python | |
398 | |
399 if __name__ == '__main__': | |
400 sys.exit(main()) | |
401 | |
198 | 402 |
199 The ``logging`` Module vs. the ``warning`` Module | 403 The ``logging`` Module vs. the ``warning`` Module |
200 ================================================= | 404 ================================================= |
201 | 405 |
202 The ``logging`` Module | 406 The ``logging`` Module |
244 referring to Pylearn internals. | 448 referring to Pylearn internals. |
245 | 449 |
246 Code Sample | 450 Code Sample |
247 =========== | 451 =========== |
248 | 452 |
249 The following code sample illustrates many of the coding guidelines one should | 453 The following code sample illustrates some of the coding guidelines one should |
250 follow in Pylearn. | 454 follow in Pylearn. This is still a work-in-progress. |
251 | 455 |
252 .. code-block:: python | 456 .. code-block:: python |
253 | 457 |
458 #! /usr/env/bin python | |
459 | |
460 """Sample code. There may still be mistakes / missing elements.""" | |
461 | |
462 __authors__ = "Olivier Delalleau" | |
463 __copyright__ = "(c) 2010, Universite de Montreal" | |
464 __license__ = "3-clause BSD License" | |
465 __contact__ = "Olivier Delalleau <delallea@iro>" | |
466 | |
467 # Standard library imports are on a single line. | |
254 import os, sys, time | 468 import os, sys, time |
255 | 469 |
470 # Third-party imports come after standard library imports, and there is | |
471 # only one import per line. Imports are sorted lexicographically. | |
472 import numpy | |
473 import scipy | |
474 import theano | |
475 # Put 'from' imports below. | |
476 from numpy import argmax | |
477 from theano import tensor | |
478 | |
479 # Application-specific imports come last. | |
480 from pylearn import dataset | |
481 from pylearn.optimization import minimize | |
482 | |
483 def print_files_in(directory): | |
484 """Print the first line of each file in given directory.""" | |
485 # TODO To be continued... | |
486 | |
487 def main(): | |
488 if len(sys.argv) != 2: | |
489 # Note: conventions on how to display script documentation and | |
490 # parse arguments are still to-be-determined. | |
491 print("""\ | |
492 Usage: %s <directory> | |
493 Print first line of each file in given directory (in alphabetic order).""" | |
494 % os.path.basename(sys.argv[0])) | |
495 return 1 | |
496 print_files_in(sys.argv[1]) | |
497 return 0 | |
498 | |
499 # Top-level executable code should be minimal. | |
500 if __name__ == '__main__': | |
501 sys.exit(main()) | |
502 | |
503 | |
504 Automatic Code Verification | |
505 =========================== | |
506 | |
507 Tools will be available to make it easier to automatically ensure that code | |
508 committed to Pylearn complies to above specifications. This work is not | |
509 finalized yet, but David started a `Wiki page`_ with helpful configuration | |
510 tips for Vim. | |
511 | |
512 .. _Wiki page: http://www.iro.umontreal.ca/~lisa/twiki/bin/view.cgi/Divers/VimPythonRecommendations | |
513 | |
514 TODO | |
515 ==== | |
516 | |
517 Things still missing from this document, being discussed in coding_style.txt: | |
518 - Proper style for C code and Mercurial commits | |
519 - Enforcing 100% test coverage of the code base | |
520 - Providing ways to add type checking for function arguments | |
521 - Conventions for script usage documentation and argument parsing | |
522 |