annotate doc/v2_planning/architecture.txt @ 1104:5e6d7d9e803a

a comment on the GPU issue for datasets
author Razvan Pascanu <r.pascanu@gmail.com>
date Mon, 13 Sep 2010 20:21:23 -0400
parents 4eda3f52ebef
children e5306f5626d4
rev   line source
1098
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
1 ====================
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
2 Pylearn Architecture
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
3 ====================
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
4
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
5
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
6 Basic Design Approach
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
7 =====================
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
8
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
9 I propose that the basic design of the library follow the Symbolic Expression
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
10 (SE) structure + virtual machine (VM) pattern that worked for Theano.
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
11
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
12 So the main things for the library to provide would be:
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
13
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
14 - a few VMs, some of which can run programs in parallel across processors,
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
15 hosts, and networks [R6,R8];
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
16
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
17 - MLA components as either individual Expressions (similar to Ops) or as
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
18 subgraphs of SEs [R5,R7,R10,R11]
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
19
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
20 - machine learning algorithms including their training and testing in the form
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
21 of python functions that build SE graphs.[R1,R8].
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
22
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
23 This design addresses R2 (modularity) because swapping components is literally implemented by
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
24 swapping subgraphs.
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
25
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
26 The design addresses R9 (algorithmic efficiency) because we can write
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
27 Theano-style graph transformations to recognize special cases of component
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
28 combinations.
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
29
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
30 The design addresses R3 if we make the additional decision that the VMs (at
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
31 least sometimes) cache the return value of program function calls. This cache
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
32 serves as a database of experimental results, indexed by the functions that
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
33 originally computed them. I think this is a very natural scheme for organizing
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
34 experiment results, and ensuring experiment reproducibility [R1].
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
35 At the same time, this is a clean and simple API behind which experiments can be
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
36 saved using a number of database technologies.
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
37
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
38 APIs vs. lambda
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
39 ----------------
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
40
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
41 Modularity in general is achieved when pieces can be substituted one for the
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
42 other.
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
43
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
44 In an object-oriented design, modularity is achieved by agreeing on interface
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
45 APIs, but in a functional design there is another possibility: the lambda.
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
46
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
47 In an SE these pieces are expression [applications] and the subgraphs they form.
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
48 A subgraph is characterized syntactically within the program by its arguments
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
49 and its return values. A lambda function allows the User to create new
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
50 Expression types from arbitrary subgraphs with very few keystrokes. When a
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
51 lambda is available and easy to use, there is much less pressure on the
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
52 expression library to follow calling and return conventions strictly.
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
53
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
54 Of course, the closer are two subgraphs in terms of their inputs, outputs, and
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
55 semantics, the easier it is to substitute one for the other. As library
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
56 designers, we should still aim for compatibility of similar algorithms. It's
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
57 just not essential to choose an API that will guarantee a match, or indeed to
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
58 choose any explicit API at all.
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
59
4eda3f52ebef v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff changeset
60