diff doc/v2_planning/architecture.txt @ 1098:4eda3f52ebef

v2planning - revs to requirements, added architecture
author James Bergstra <bergstrj@iro.umontreal.ca>
date Mon, 13 Sep 2010 09:42:36 -0400
parents
children e5306f5626d4
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/v2_planning/architecture.txt	Mon Sep 13 09:42:36 2010 -0400
@@ -0,0 +1,60 @@
+====================
+Pylearn Architecture
+====================
+
+
+Basic Design Approach
+=====================
+
+I propose that the basic design of the library follow the Symbolic Expression
+(SE) structure + virtual machine (VM) pattern that worked for Theano.
+
+So the main things for the library to provide would be:
+
+- a few VMs, some of which can run programs in parallel across processors,
+  hosts, and networks [R6,R8];
+
+- machine learning algorithm (MLA) components, as either individual Expressions
+  (similar to Ops) or subgraphs of SEs [R5,R7,R10,R11];
+
+- machine learning algorithms, including their training and testing, in the
+  form of Python functions that build SE graphs [R1,R8].
+
+This design addresses R2 (modularity) because swapping components is literally implemented by
+swapping subgraphs.
+
+The design addresses R9 (algorithmic efficiency) because we can write
+Theano-style graph transformations to recognize special cases of component
+combinations.
+
+The design addresses R3 if we make the additional decision that the VMs (at
+least sometimes) cache the return values of the function calls a program makes.
+This cache serves as a database of experimental results, indexed by the
+functions that originally computed them.  I think this is a very natural scheme
+for organizing experiment results and for ensuring experiment reproducibility
+[R1].  At the same time, this is a clean and simple API behind which
+experiments can be saved using a number of database technologies.
+
+APIs vs. lambda
+---------------
+
+Modularity in general is achieved when pieces can be substituted for one
+another.
+
+In an object-oriented design, modularity is achieved by agreeing on interface
+APIs, but in a functional design there is another possibility: the lambda.
+
+In an SE, these pieces are Expression applications and the subgraphs they form.
+A subgraph is characterized syntactically within the program by its arguments
+and its return values.  A lambda function allows the user to create new
+Expression types from arbitrary subgraphs with very few keystrokes.  When a
+lambda is available and easy to use, there is much less pressure on the
+expression library to follow calling and return conventions strictly.
+
+Of course, the closer two subgraphs are in terms of their inputs, outputs, and
+semantics, the easier it is to substitute one for the other.  As library
+designers, we should still aim for compatibility of similar algorithms.  It's
+just not essential to choose an API that will guarantee a match, or indeed to
+choose any explicit API at all.
+
+
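Note on the "APIs vs. lambda" section: the idea of building new Expression
types from arbitrary subgraphs, and the result-caching VM proposed for R3, can
be illustrated with a toy sketch. Everything here is hypothetical and exists
only for illustration: `Var`, `Apply`, `OPS`, `CachingVM`, and `lam` are not
Pylearn or Theano names, and a real design would differ in many respects (typed
variables, graph optimizations, persistent storage for the cache).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Var:
    """A named symbolic input (hypothetical, not a Pylearn/Theano class)."""
    name: str


@dataclass(frozen=True)
class Apply:
    """Application of a named operation to argument expressions."""
    op: str
    args: Tuple


# Assumed toy operation table; a real library would have a richer Op type.
OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}


class CachingVM:
    """Toy VM that caches results, keyed by (expression, input bindings)."""

    def __init__(self):
        self.cache = {}
        self.calls = 0  # number of operations actually executed

    def run(self, expr, env):
        if isinstance(expr, Var):
            return env[expr.name]
        if not isinstance(expr, Apply):
            return expr  # plain Python constant
        key = (expr, tuple(sorted(env.items())))
        if key not in self.cache:
            values = [self.run(arg, env) for arg in expr.args]
            self.calls += 1
            self.cache[key] = OPS[expr.op](*values)
        return self.cache[key]


def lam(params, body):
    """Turn a subgraph into a reusable builder by substituting its params."""

    def substitute(expr, mapping):
        if isinstance(expr, Var):
            return mapping.get(expr.name, expr)
        if isinstance(expr, Apply):
            new_args = tuple(substitute(a, mapping) for a in expr.args)
            return Apply(expr.op, new_args)
        return expr

    def build(*args):
        return substitute(body, dict(zip(params, args)))

    return build


# Two interchangeable components with the same (a, b) -> value signature.
affine = lam(["a", "b"], Apply("add", (Apply("mul", (Var("a"), 2)), Var("b"))))
scale = lam(["a", "b"], Apply("mul", (Var("a"), Var("b"))))

x, y = Var("x"), Var("y")
vm = CachingVM()
inputs = {"x": 3, "y": 4}

first = vm.run(affine(x, y), inputs)    # computes (3 * 2) + 4
swapped = vm.run(scale(x, y), inputs)   # same call site, different subgraph
calls_before = vm.calls
again = vm.run(affine(x, y), inputs)    # served from the cache
```

Swapping `affine` for `scale` changes only which subgraph is spliced in, and
the repeated `affine` evaluation is answered from the cache without executing
any operation again. Neither component had to agree on an interface class;
they only happen to take the same number of arguments.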