.. Source: pylearn Mercurial repository, doc/v2_planning/datalearn_pytables.txt @ 1470:94268a161925
   ("memmap support in Dataset op", James Bergstra <bergstrj@iro.umontreal.ca>,
   Wed, 18 May 2011 10:50:21 -0400, parent 1934ba31b7d9).

Big Dataset/Output
==================

This file describes our current plan for handling datasets and outputs that do not fit in memory, using PyTables.


PyTables
--------

We address this problem by making PyTables usable in and with Theano.

Here are the tickets I plan to work on, in order:

- Fix the ift6266 script to load PNIST data (DONE)
- Put PNIST in PyTables format (done, first version)
- Example with PyTables data in Python (done, with modifications needed in Theano)
- Basic filter in the dataset in Python
- Example with PyTables output in Python
- ? Put stats in the PyTables file so we don't have to read the whole file each time we normalize.
  Maybe we will try another mechanism, but I will start with this one as it seems simple to do.
- Example with PyTables data in Theano
- Basic filter in the dataset in Theano
- Example with PyTables output in Theano
- Plan a way to store the output temporarily (delete it and store locally)

The first three items are there to verify that PyTables can do what we want.
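
As a rough illustration of the Python-side tickets above (PyTables data, a basic filter, and cached normalization statistics), here is a minimal sketch. It assumes PyTables 3.x (the ``open_file``/``create_earray`` spelling) and NumPy. The node name ``x``, the helper names ``build_file`` and ``filtered_rows``, and the random toy data are made up for this example and are not part of pylearn.

.. code-block:: python

    import os
    import tempfile

    import numpy as np
    import tables  # assumption: PyTables >= 3.0


    def build_file(path, n_rows=1000, n_features=8, chunk=100, seed=0):
        """Write a toy dataset chunk by chunk and cache its statistics."""
        rng = np.random.default_rng(seed)
        with tables.open_file(path, mode="w") as h5:
            # Extendable array: the first dimension (rows) can grow on append.
            x = h5.create_earray(h5.root, "x", atom=tables.Float32Atom(),
                                 shape=(0, n_features))
            total = np.zeros(n_features)
            total_sq = np.zeros(n_features)
            for start in range(0, n_rows, chunk):
                size = min(chunk, n_rows - start)
                block = rng.normal(size=(size, n_features)).astype("float32")
                x.append(block)
                total += block.sum(axis=0, dtype="float64")
                total_sq += (block.astype("float64") ** 2).sum(axis=0)
            mean = total / n_rows
            # Store the statistics next to the data as node attributes, so
            # normalizing later does not require another pass over the file.
            x.attrs.mean = mean
            x.attrs.std = np.sqrt(total_sq / n_rows - mean ** 2)


    def filtered_rows(path, threshold, chunk=100):
        """Yield rows whose first feature exceeds ``threshold``, chunk by chunk."""
        with tables.open_file(path, mode="r") as h5:
            x = h5.root.x
            for start in range(0, x.nrows, chunk):
                block = x[start:start + chunk]  # only this slice is read
                yield block[block[:, 0] > threshold]


    path = os.path.join(tempfile.mkdtemp(), "toy.h5")
    build_file(path)

    with tables.open_file(path, mode="r") as h5:
        x = h5.root.x
        # Normalize a single slice using the cached statistics.
        batch = (x[0:10] - x.attrs.mean) / x.attrs.std

    kept = sum(len(rows) for rows in filtered_rows(path, 0.0))

Only the requested slices are loaded into memory, which is the property the first tickets are meant to verify. Whether attributes are the right place for the statistics is still an open question, as noted in the list.
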

Here is the current plan:

- Make a PyTableVariable
- Make a PyTableSubtensor op (or reuse the current one) to allow taking a
  slice of the new variable
- Allow scan to work with this new PyTableVariable as input
- Allow scan to work with this new PyTableVariable as output

At first, there will be no in-place ops, so view_map and destroy_map won't work. The way I see it, this is an interface to a file inside Theano. No direct modification will be allowed at first.
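
To make the "interface to a file" idea concrete, here is a small plain-Python sketch of the intended semantics: slices can be read, in-place writes are refused, and a loop can consume the data piece by piece. This is not Theano code. The class ``ReadOnlyFileView`` and its ``minibatches`` method are hypothetical names used only for illustration, and a list stands in for the PyTables node.

.. code-block:: python

    class ReadOnlyFileView:
        """Hypothetical stand-in for the planned variable: reads only."""

        def __init__(self, storage):
            # Any object supporting len() and slicing, e.g. a PyTables array.
            self._storage = storage

        def __len__(self):
            return len(self._storage)

        def __getitem__(self, index):
            # Plays the role of the subtensor op: the caller gets whatever
            # the storage returns for that index or slice.
            return self._storage[index]

        def __setitem__(self, index, value):
            # No direct modification is allowed at first.
            raise TypeError("in-place modification is not allowed")

        def minibatches(self, size):
            """Yield consecutive slices, as a loop over the input would."""
            for start in range(0, len(self), size):
                yield self[start:start + size]


    view = ReadOnlyFileView(list(range(10)))
    batches = list(view.minibatches(4))

Refusing ``__setitem__`` outright mirrors the decision above to postpone in-place ops. Support for views and destructive updates could be added later without changing how reads work.
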

Getting the code
----------------

- Clone FB's repo:

.. code-block:: bash

    git clone git@github.com:nouiz/pylearn.git Pylearn.nouiz

- Create a local branch that tracks the remote branch:

.. code-block:: bash

    git checkout -b variants origin/pytables