diff doc/v2_planning/architecture.txt @ 1098:4eda3f52ebef
v2planning - revs to requirements, added architecture
author: James Bergstra <bergstrj@iro.umontreal.ca>
date: Mon, 13 Sep 2010 09:42:36 -0400
parents:
children: e5306f5626d4
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/v2_planning/architecture.txt Mon Sep 13 09:42:36 2010 -0400
@@ -0,0 +1,60 @@
+====================
+Pylearn Architecture
+====================
+
+
+Basic Design Approach
+=====================
+
+I propose that the basic design of the library follow the Symbolic Expression
+(SE) structure + virtual machine (VM) pattern that worked for Theano.
+
+So the main things for the library to provide would be:
+
+- a few VMs, some of which can run programs in parallel across processors,
+  hosts, and networks [R6,R8];
+
+- MLA components as either individual Expressions (similar to Ops) or as
+  subgraphs of SEs [R5,R7,R10,R11];
+
+- machine learning algorithms, including their training and testing, in the
+  form of Python functions that build SE graphs [R1,R8].
+
+This design addresses R2 (modularity) because swapping components is
+literally implemented by swapping subgraphs.
+
+The design addresses R9 (algorithmic efficiency) because we can write
+Theano-style graph transformations to recognize special cases of component
+combinations.
+
+The design addresses R3 if we make the additional decision that the VMs (at
+least sometimes) cache the return value of program function calls. This cache
+serves as a database of experimental results, indexed by the functions that
+originally computed them. I think this is a very natural scheme for organizing
+experiment results and ensuring experiment reproducibility [R1].
+At the same time, this is a clean and simple API behind which experiments can be
+saved using a number of database technologies.
+
+APIs vs. lambda
+----------------
+
+Modularity in general is achieved when pieces can be substituted one for the
+other.
+
+In an object-oriented design, modularity is achieved by agreeing on interface
+APIs, but in a functional design there is another possibility: the lambda.
+
+In an SE these pieces are expression [applications] and the subgraphs they form.
+A subgraph is characterized syntactically within the program by its arguments
+and its return values. A lambda function allows the User to create new
+Expression types from arbitrary subgraphs with very few keystrokes. When a
+lambda is available and easy to use, there is much less pressure on the
+expression library to follow calling and return conventions strictly.
+
+Of course, the closer two subgraphs are in terms of their inputs, outputs, and
+semantics, the easier it is to substitute one for the other. As library
+designers, we should still aim for compatibility of similar algorithms. It's
+just not essential to choose an API that will guarantee a match, or indeed to
+choose any explicit API at all.
+
+
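As a reading aid for the proposal in this diff, the following is a minimal sketch of how the three ideas might fit together: an SE graph, a VM that caches function-call results, and a lambda that turns a subgraph into a swappable component. It is not Pylearn or Theano code. Every name in it (`Var`, `Apply`, `VM`, `make_lambda`, `substitute`, `build_program`) is hypothetical, the cache is assumed to be an in-memory dict keyed by function and argument values, and argument values are assumed to be hashable.

```python
# Illustrative sketch only: all class and function names here are
# hypothetical and are not part of Pylearn or Theano.
import operator


class Var:
    """A named input whose value is supplied when the program runs."""

    def __init__(self, name):
        self.name = name


class Apply:
    """Application of a Python callable to argument expressions."""

    def __init__(self, fn, *args):
        self.fn = fn
        self.args = args


class VM:
    """Evaluates an expression graph and caches every function call.

    The cache is indexed by (function, argument values), so it doubles as
    a small store of results keyed by the function that computed them.
    Assumes argument values are hashable.
    """

    def __init__(self):
        self.cache = {}
        self.calls = 0  # number of function calls actually executed

    def run(self, expr, env):
        if isinstance(expr, Var):
            return env[expr.name]
        args = tuple(self.run(a, env) for a in expr.args)
        key = (expr.fn, args)
        if key not in self.cache:
            self.calls += 1
            self.cache[key] = expr.fn(*args)
        return self.cache[key]


def substitute(expr, mapping):
    """Rebuild `expr`, replacing variables by name according to `mapping`."""
    if isinstance(expr, Var):
        return mapping.get(expr.name, expr)
    return Apply(expr.fn, *(substitute(a, mapping) for a in expr.args))


def make_lambda(params, body):
    """Wrap the subgraph `body`, with inputs `params`, as a reusable component."""
    names = [p.name for p in params]

    def build(*args):
        return substitute(body, dict(zip(names, args)))

    return build


# Two components with the same inputs and outputs, each defined as a subgraph.
x, y = Var("x"), Var("y")
diff = Apply(operator.sub, x, y)
squared_error = make_lambda([x, y], Apply(operator.mul, diff, diff))
abs_error = make_lambda([x, y], Apply(abs, diff))


def build_program(loss):
    """Program builder: swapping `loss` swaps a subgraph of the program."""
    return loss(Var("pred"), Var("target"))


vm = VM()
env = {"pred": 3, "target": 5}
result_sq = vm.run(build_program(squared_error), env)  # (3 - 5) * (3 - 5) = 4
result_abs = vm.run(build_program(abs_error), env)     # abs(3 - 5) = 2
```

In this sketch the two programs share the `pred - target` call, so the VM executes only three function calls in total (one subtraction, one multiplication, one `abs`), and rerunning either program executes none. A persistent backend could replace the dict without changing `run`, which is the sense in which the cache acts as a simple API in front of a database.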