comparison doc/v2_planning/API_coding_style.txt @ 1331:0541e7d6e916

merge
author gdesjardins
date Thu, 14 Oct 2010 23:55:55 -0400
parents 4efa2630f430
children 90116fb3636b
1330:3efd0effb2a7 1331:0541e7d6e916
257 257
258 .. code-block:: python 258 .. code-block:: python
259 259
260 """Module docstring as the first line, as usual.""" 260 """Module docstring as the first line, as usual."""
261 261
262 __authors__ = "Olivier Delalleau, Frederic Bastien, David Warde-Farley" 262 __authors__ = "Olivier Delalleau, Frederic Bastien, David Warde-Farley"
263 __copyright__ = "(c) 2010, Universite de Montreal" 263 __copyright__ = "(c) 2010, Universite de Montreal"
264 __license__ = "3-clause BSD License" 264 __license__ = "3-clause BSD License"
265 __contact__ = "Name Of Current Guardian of this file <email@address>" 265 __contact__ = "Name Of Current Guardian of this file <email@address>"
266 266
267 * Use ``//`` for integer division and ``/ float(...)`` if you want the 267 * Use ``//`` for integer division and ``/ float(...)`` if you want the
268 floating point operation (for readability and compatibility across all 268 floating point operation (for readability and compatibility across all
269 versions of Python). 269 versions of Python).
270 270
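As a small illustration of the division guideline above (a sketch with made-up variable names, not code taken from Pylearn), the two operators behave the same way under every Python version:

```python
n_samples = 7
batch_size = 2

# Integer division: the intent (dropping the remainder) is explicit.
n_full_batches = n_samples // batch_size

# Floating point division: converting one operand makes the result a float
# regardless of how the interpreter treats `/` between two integers.
mean_batches = n_samples / float(batch_size)
```

Here ``n_full_batches`` is ``3`` and ``mean_batches`` is ``3.5``.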
316 if efficiency is typically not an issue here, the main goal being code 316 if efficiency is typically not an issue here, the main goal being code
317 consistency). Also, always use ``numpy.isinf`` / ``numpy.isnan`` to 317 consistency). Also, always use ``numpy.isinf`` / ``numpy.isnan`` to
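318 test infinite / NaN values. This is important because NaN never compares 318 test infinite / NaN values. This is important because NaN never compares
319 equal to anything, itself included (``numpy.nan != float('nan')``). 319 equal to anything, itself included (``numpy.nan != float('nan')``).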
320 320
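The NaN guideline above can be illustrated with a short sketch (it assumes numpy is installed; the array contents are arbitrary):

```python
import numpy

values = numpy.array([1.0, float('nan'), numpy.inf])

# An equality test cannot find NaN: NaN is not equal to anything,
# not even to another NaN.
found_with_equality = (values == numpy.nan).any()

# The dedicated predicates give the expected element-wise masks.
nan_mask = numpy.isnan(values)
inf_mask = numpy.isinf(values)
```

``found_with_equality`` is false even though the array contains a NaN, whereas ``nan_mask`` and ``inf_mask`` flag the second and third elements respectively.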
321 * Whenever possible, mimic the numpy / scipy interfaces when writing code
322 similar to what can be found in these packages.
323
321 * Avoid backslashes whenever possible. They make it more 324 * Avoid backslashes whenever possible. They make it more
322 difficult to edit code, and they are ugly (as well as potentially 325 difficult to edit code, and they are ugly (as well as potentially
323 dangerous if there are trailing white spaces). 326 dangerous if there are trailing white spaces).
324 327
325 .. code-block:: python 328 .. code-block:: python
343 * When indenting multi-line statements like lists or function arguments, 346 * When indenting multi-line statements like lists or function arguments,
344 keep elements of the same level aligned with each other. 347 keep elements of the same level aligned with each other.
345 The position of the first 348 The position of the first
346 element (on the same line or a new line) should be chosen depending on 349 element (on the same line or a new line) should be chosen depending on
347 what is easiest to read (sometimes both can be ok). 350 what is easiest to read (sometimes both can be ok).
351 Other formatting may be ok depending on the specific situation; use
352 common sense and pick whichever looks best.
348 353
349 .. code-block:: python 354 .. code-block:: python
350 355
351 # Good. 356 # Good.
352 for my_very_long_variable_name in [my_foo, my_bar, my_love, 357 for my_very_long_variable_name in [my_foo, my_bar, my_love,
472 477
473 Code Sample 478 Code Sample
474 =========== 479 ===========
475 480
476 The following code sample illustrates some of the coding guidelines one should 481 The following code sample illustrates some of the coding guidelines one should
477 follow in Pylearn. This is still a work-in-progress. 482 follow in Pylearn. This is still a work-in-progress. Feel free to improve it and
483 add more!
478 484
479 .. code-block:: python 485 .. code-block:: python
480 486
481 #!/usr/bin/env python 487 #!/usr/bin/env python
482 488
483 """Sample code. There may still be mistakes / missing elements.""" 489 """Sample code. Edit it as you like!"""
484 490
485 __authors__ = "Olivier Delalleau" 491 __authors__ = "Olivier Delalleau"
486 __copyright__ = "(c) 2010, Universite de Montreal" 492 __copyright__ = "(c) 2010, Universite de Montreal"
487 __license__ = "3-clause BSD License" 493 __license__ = "3-clause BSD License"
488 __contact__ = "Olivier Delalleau <delallea@iro>" 494 __contact__ = "Olivier Delalleau <delallea@iro>"
489 495
490 # Standard library imports are on a single line. 496 # Standard library imports are on a single line.
491 import os, sys, time 497 import os, sys, time
492 498
493 # Third-party imports come after standard library imports, and there is 499 # Third-party imports come after standard library imports, and there is
494 # only one import per line. Imports are sorted lexicographically. 500 # only one import per line. Imports are sorted lexicographically.
495 import numpy 501 import numpy
496 import scipy 502 import scipy
497 import theano 503 import theano
498 # Put 'from' imports below. 504 # Individual 'from' imports come after packages.
499 from numpy import argmax 505 from numpy import argmax
500 from theano import tensor 506 from theano import tensor
501 507
502 # Application-specific imports come last. 508 # Application-specific imports come last.
503 from pylearn import dataset 509 # The absolute path should always be used.
504 from pylearn.optimization import minimize 510 from pylearn import datasets, learner
505 511 from pylearn.formulas import noise
506 def print_files_in(directory): 512
507 """Print the first line of each file in given directory.""" 513
508 # TODO To be continued... 514 # All exceptions inherit from Exception.
515 class PylearnError(Exception):
516 # TODO Write doc.
517 pass
518
519 # All top-level classes inherit from object.
520 class StorageExample(object):
521 # TODO Write doc.
522 pass
523
524
525 # Two blank lines between definitions of top-level classes and functions.
526 class AwesomeLearner(learner.Learner):
527 # TODO Write doc.
528
529 def __init__(self, print_fields=None):
530 # TODO Write doc.
531 # print_fields is a list of strings whose counts found in the
532 # training set should be printed at the end of training. If None,
533 # then nothing is printed.
534 # Do not forget to call the parent class constructor.
535 super(AwesomeLearner, self).__init__()
536 # Use None instead of an empty list as default argument to
537 # print_fields to avoid issues with mutable default arguments.
538 self.print_fields = if_none(print_fields, [])
539
540 # One blank line between method definitions.
541 def add_field(self, field):
542 # TODO Write doc.
543 # Test if something belongs to a container with `in`, not
544 # container-specific methods like `index`.
545 if field in self.print_fields:
546 # TODO Print a warning and do nothing.
547 pass
548 else:
549 # This is why using [] as default to print_fields in the
550 # constructor would have been a bad idea.
551 self.print_fields.append(field)
552
553 def train(self, dataset):
554 # TODO Write doc (store the mean of each field in the training
555 # set).
556 self.mean_fields = {}
557 count = {}
558 for sample_dict in dataset:
559 # Whenever it is enough for what you need, use iterative
560 # instead of list versions of dictionary methods.
561 for field, value in sample_dict.iteritems():
562 # Keep line length to max 80 characters, using parentheses
563 # instead of \ to continue long lines.
564 self.mean_fields[field] = (self.mean_fields.get(field, 0) +
565 value)
566 count[field] = count.get(field, 0) + 1
567 for field in self.mean_fields:
568 self.mean_fields[field] /= float(count[field])
569 for field in self.print_fields:
570 # Test is done with `in`, not `has_key`.
571 if field in self.mean_fields:
572 # TODO Use log module instead.
573 print '%s: %s' % (field, self.mean_fields[field])
574 else:
575 # TODO Print warning.
576 pass
577
578 def test_error(self, dataset):
579 # TODO Write doc.
580 if not hasattr(self, 'mean_fields'):
581 # Exceptions should be raised as follows (in particular, no
582 # string exceptions!).
583 raise PylearnError('Cannot test a learner that was not '
584 'trained.')
585 error = 0
586 count = 0
587 for sample_dict in dataset:
588 for field, value in sample_dict.iteritems():
589 try:
590 # Minimize code into a try statement.
591 mean = self.mean_fields[field]
592 # Always specify which kind of exception you are
593 # intercepting with except.
594 except KeyError:
595 raise PylearnError(
596 "Found in a test sample a field ('%s') that had "
597 "never been seen in the training set." % field)
598 error += (value - mean)**2
599 count += 1
600 # Remember to divide by a floating point number unless you
601 # explicitly want an integer division (in which case you should
602 # use //).
603 mse = error / float(count)
604 # TODO Use log module instead.
605 print 'MSE: %s' % mse
606 return mse
607
608
609 def if_none(val_if_not_none, val_if_none):
610 # TODO Write doc.
611 if val_if_not_none is not None:
612 return val_if_not_none
613 else:
614 return val_if_none
615
616
617 def print_subdirs_in(directory):
618 # TODO Write doc.
619 # Using list comprehension rather than filter.
620 sub_dirs = sorted([d for d in os.listdir(directory)
621 if os.path.isdir(os.path.join(directory, d))])
622 print '%s: %s' % (directory, ' '.join(sub_dirs))
623 # A `for` loop is often easier to read than a call to `map`.
624 for d in sub_dirs:
625 print_subdirs_in(os.path.join(directory, d))
626
509 627
510 def main(): 628 def main():
511 if len(sys.argv) != 2: 629 if len(sys.argv) != 2:
512 # Note: conventions on how to display script documentation and 630 # Note: conventions on how to display script documentation and
513 # parse arguments are still to-be-determined. 631 # parse arguments are still to-be-determined. This is just one
632 # way to do it.
514 print("""\ 633 print("""\
515 Usage: %s <directory> 634 Usage: %s <directory>
516 Print first line of each file in given directory (in alphabetic order).""" 635 For the given directory and all sub-directories found inside it, print
636 the list of the directories they contain."""
517 % os.path.basename(sys.argv[0])) 637 % os.path.basename(sys.argv[0]))
518 return 1 638 return 1
519 print_files_in(sys.argv[1]) 639 print_subdirs_in(sys.argv[1])
520 return 0 640 return 0
641
521 642
522 # Top-level executable code should be minimal. 643 # Top-level executable code should be minimal.
523 if __name__ == '__main__': 644 if __name__ == '__main__':
524 sys.exit(main()) 645 sys.exit(main())
525 646
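The sample above uses ``None`` rather than ``[]`` as the default value of ``print_fields``. The reason can be seen in the following minimal sketch (the function names are hypothetical and exist only for this illustration):

```python
def add_field_bad(field, fields=[]):
    # The default list is created once, when the function is defined,
    # and is then shared by every call that omits `fields`.
    fields.append(field)
    return fields


def add_field_good(field, fields=None):
    # A fresh list is created on each call that omits `fields`.
    if fields is None:
        fields = []
    fields.append(field)
    return fields


first_bad = add_field_bad('a')
second_bad = add_field_bad('b')    # Appends to the same shared list.
first_good = add_field_good('a')
second_good = add_field_good('b')  # Independent of the first call.
```

After these calls ``first_bad`` and ``second_bad`` are the same list containing both fields, while ``first_good`` and ``second_good`` each hold a single field.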
531 committed to Pylearn complies with the above specifications. This work is not 652 committed to Pylearn complies with the above specifications. This work is not
532 finalized yet, but David started a `Wiki page`_ with helpful configuration 653 finalized yet, but David started a `Wiki page`_ with helpful configuration
533 tips for Vim. 654 tips for Vim.
534 655
535 .. _Wiki page: http://www.iro.umontreal.ca/~lisa/twiki/bin/view.cgi/Divers/VimPythonRecommendations 656 .. _Wiki page: http://www.iro.umontreal.ca/~lisa/twiki/bin/view.cgi/Divers/VimPythonRecommendations
657
658 Commit message
659 ==============
660
661 * A one line summary. Try to keep it short, and provide the information
662 that seems most useful to other developers: in particular the goal of
663 a change is more useful than its description (which is always
664 available through the changeset patch log). E.g. say "Improved stability
665 of cost computation" rather than "Replaced log(exp(a) + exp(b)) by
666 a + log(1 + exp(b - a)) in cost computation".
667 * If needed, a blank line followed by a more detailed summary
668 * Make a commit for each logical modification
669 * This makes reviews easier to do
670 * This makes debugging easier as we can more easily pinpoint errors in
671 commits with hg bisect
672 * NEVER commit reformatting with functionality changes
673 * Review your change before committing
674 * "hg diff <files>..." to see the diff you have done
675 * "hg record" allows you to select which changes to a file should be
676 committed. To enable it, put into the file ~/.hgrc:
677
678 .. code-block:: bash
679
680 [extensions]
681 hgext.record=
682
683 * hg record / diff force you to review your code; never commit without
684 running one of these two commands first
685 * Write detailed commit messages in the past tense, not present tense.
686 * Good: "Fixed Unicode bug in RSS API."
687 * Bad: "Fixes Unicode bug in RSS API."
688 * Bad: "Fixing Unicode bug in RSS API."
689 * Separate bug fixes from feature changes.
690 * When fixing a ticket, start the message with "Fixed #abc"
691 * Can we set up a system that updates the ticket automatically?
692 * When referencing a ticket, start the message with "Refs #abc"
693 * Can we set up a system that adds a comment to the ticket automatically?
694
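The kind of change behind the "Improved stability of cost computation" example in the first item of this list can be sketched as follows. This is only an illustration of the identity log(exp(a) + exp(b)) = m + log(1 + exp(-abs(a - b))) with m = max(a, b); the function names are hypothetical and nothing here is taken from the Pylearn code base.

```python
import math


def log_sum_exp_naive(a, b):
    # Direct transcription of the formula: overflows for large inputs.
    return math.log(math.exp(a) + math.exp(b))


def log_sum_exp_stable(a, b):
    # Factor out the larger argument so that exp() only ever receives
    # a value that is less than or equal to zero.
    largest, smallest = max(a, b), min(a, b)
    return largest + math.log1p(math.exp(smallest - largest))


small_difference = abs(log_sum_exp_naive(1.0, 2.0) -
                       log_sum_exp_stable(1.0, 2.0))
large_result = log_sum_exp_stable(1000.0, 1000.0)
```

Both versions agree on small inputs, but only the stable one handles ``a = b = 1000``: the naive version raises ``OverflowError`` there. A commit message that states this goal tells other developers more than one that restates the formula.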
536 695
537 TODO 696 TODO
538 ==== 697 ====
539 698
540 Things still missing from this document, being discussed in coding_style.txt: 699 Things still missing from this document, being discussed in coding_style.txt: