Mercurial > pylearn
annotate doc/v2_planning/architecture.txt @ 1104:5e6d7d9e803a
a comment on the GPU issue for datasets
author | Razvan Pascanu <r.pascanu@gmail.com> |
---|---|
date | Mon, 13 Sep 2010 20:21:23 -0400 |
parents | 4eda3f52ebef |
children | e5306f5626d4 |
rev | line source |
---|---|
1098
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
1 ==================== |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
2 Pylearn Architecture |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
3 ==================== |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
4 |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
5 |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
6 Basic Design Approach |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
7 ===================== |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
8 |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
9 I propose that the basic design of the library follow the Symbolic Expression |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
10 (SE) structure + virtual machine (VM) pattern that worked for Theano. |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
11 |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
12 So the main things for the library to provide would be: |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
13 |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
14 - a few VMs, some of which can run programs in parallel across processors, |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
15 hosts, and networks [R6,R8]; |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
16 |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
17 - MLA components as either individual Expressions (similar to Ops) or as |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
18 subgraphs of SEs [R5,R7,R10,R11] |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
19 |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
20 - machine learning algorithms including their training and testing in the form |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
21 of python functions that build SE graphs.[R1,R8]. |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
22 |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
23 This design addresses R2 (modularity) because swapping components is literally implemented by |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
24 swapping subgraphs. |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
25 |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
26 The design addresses R9 (algorithmic efficiency) because we can write |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
27 Theano-style graph transformations to recognize special cases of component |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
28 combinations. |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
29 |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
30 The design addresses R3 if we make the additional decision that the VMs (at |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
31 least sometimes) cache the return value of program function calls. This cache |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
32 serves as a database of experimental results, indexed by the functions that |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
33 originally computed them. I think this is a very natural scheme for organizing |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
34 experiment results, and ensuring experiment reproducibility [R1]. |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
35 At the same time, this is a clean and simple API behind which experiments can be |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
36 saved using a number of database technologies. |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
37 |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
38 APIs vs. lambda |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
39 ---------------- |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
40 |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
41 Modularity in general is achieved when pieces can be substituted one for the |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
42 other. |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
43 |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
44 In an object-oriented design, modularity is achieved by agreeing on interface |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
45 APIs, but in a functional design there is another possibility: the lambda. |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
46 |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
47 In an SE these pieces are expression [applications] and the subgraphs they form. |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
48 A subgraph is characterized syntactically within the program by its arguments |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
49 and its return values. A lambda function allows the User to create new |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
50 Expression types from arbitrary subgraphs with very few keystrokes. When a |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
51 lambda is available and easy to use, there is much less pressure on the |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
52 expression library to follow calling and return conventions strictly. |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
53 |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
54 Of course, the closer are two subgraphs in terms of their inputs, outputs, and |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
55 semantics, the easier it is to substitute one for the other. As library |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
56 designers, we should still aim for compatibility of similar algorithms. It's |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
57 just not essential to choose an API that will guarantee a match, or indeed to |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
58 choose any explicit API at all. |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
59 |
4eda3f52ebef
v2planning - revs to requirements, added architecture
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
60 |