Mercurial > pylearn
annotate doc/v2_planning/datalearn_pytables.txt @ 1396:310e22d7e44b
new file about datalearn in pytables.
author | Frederic Bastien <nouiz@nouiz.org> |
---|---|
date | Mon, 10 Jan 2011 14:55:39 -0500 |
parents | |
children | 702a933794f7 |
Big Dataset/Output
==================

This file describes the current plan for handling datasets and outputs that don't fit in memory, using PyTables.


PyTables
--------

We try to fix that problem by allowing PyTables to be used in/with Theano.

Here are the tickets that I plan to do, in order.

-1) Fix ift6266 script to load PNIST data (DONE)
0) Put PNIST in PyTables format
1) example with PyTables data in python
2) basic filter on the dataset in python
3) example with PyTables output in python
?) put stats in the PyTables file to avoid reading the whole file each time we normalize
   Maybe we will try another mechanism. But I will start with this one as it seems simple to do.
4) example with PyTables data in theano
5) basic filter on the dataset in theano
6) example with PyTables output in theano
7) plan a way to store the output temporarily (delete it and store locally)

Tickets 1, 2 and 3 are there to verify that PyTables can do what we want!
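
As a rough illustration of what tickets 1, 2, 3 and the "?" ticket could look like in plain Python, here is a minimal sketch. It assumes the PyTables 3.x function names (`open_file`, `create_earray`) and NumPy; the node names `x` and `y`, the toy random data, and the helper names `build_demo_file` and `read_filtered` are made up for this example and are not part of the plan.

```python
import os
import tempfile

import numpy as np
import tables


def build_demo_file(path, n_rows=1000, n_features=8, chunk=256):
    """Write a toy dataset chunk by chunk, so it never has to fit in memory."""
    rng = np.random.RandomState(0)
    with tables.open_file(path, mode="w") as h5:
        # Extendable arrays: the dimension of size 0 grows on each append().
        x = h5.create_earray(h5.root, "x", atom=tables.Float32Atom(),
                             shape=(0, n_features))
        y = h5.create_earray(h5.root, "y", atom=tables.Int32Atom(),
                             shape=(0,))
        total = np.zeros(n_features)
        total_sq = np.zeros(n_features)
        for start in range(0, n_rows, chunk):
            n = min(chunk, n_rows - start)
            block = rng.rand(n, n_features).astype("float32")
            x.append(block)
            y.append(rng.randint(0, 10, size=n).astype("int32"))
            # Accumulate running sums in float64 for the statistics.
            total += block.sum(axis=0, dtype="float64")
            total_sq += (block.astype("float64") ** 2).sum(axis=0)
        # Cache the normalization statistics next to the data, so later
        # readers do not have to scan the whole file again.
        mean = total / n_rows
        x.attrs.mean = mean
        x.attrs.std = np.sqrt(total_sq / n_rows - mean ** 2)


def read_filtered(path, start, stop, label):
    """Read rows [start, stop), normalize them, keep one label only."""
    with tables.open_file(path, mode="r") as h5:
        x, y = h5.root.x, h5.root.y
        xs = x[start:stop]  # only the requested rows are read from disk
        ys = y[start:stop]
        normalized = (xs - x.attrs.mean) / x.attrs.std
        keep = ys == label  # basic filter, applied on the loaded slice
        return normalized[keep], ys[keep]
```

Here the filter is a NumPy mask applied after slicing. PyTables also offers in-kernel queries on `Table` nodes, which may be a better fit if the filter has to run over the whole file.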

Here is the current plan.
- Make a PyTableVariable
- Make a PyTableSubtensor op (or reuse the current one) to allow taking a
  slice of the new variable
- Allow scan to work with this new PyTableVariable as input
- Allow scan to work with this new PyTableVariable as output

At first there will be no inplace ops, so view_map and destroy_map won't work. I see it as an interface to a file inside Theano. No direct modification will be allowed at first.
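
To make the "interface to a file, no direct modification" idea concrete, here is a small sketch in plain Python using only the standard library. It is not Theano code and not the planned PyTableVariable: the class `ReadOnlyFileVector`, the helper `write_vector`, and the raw float64 file layout are all hypothetical, chosen only to show slices being returned as copies (no views) and in-place writes being refused.

```python
import os
import struct
import tempfile

ITEM_SIZE = struct.calcsize("<d")  # one little-endian float64 per element


class ReadOnlyFileVector:
    """Hypothetical read-only, file-backed vector (illustration only)."""

    def __init__(self, path):
        self.path = path
        self.length = os.path.getsize(path) // ITEM_SIZE

    def __len__(self):
        return self.length

    def __getitem__(self, key):
        if not isinstance(key, slice):
            raise TypeError("only slices are supported in this sketch")
        start, stop, step = key.indices(self.length)
        if step != 1:
            raise ValueError("only contiguous slices are supported")
        count = max(0, stop - start)
        with open(self.path, "rb") as f:
            f.seek(start * ITEM_SIZE)  # read only the requested range
            raw = f.read(count * ITEM_SIZE)
        # A fresh list: a copy of the data, never a view on the file.
        return list(struct.unpack("<%dd" % count, raw))

    def __setitem__(self, key, value):
        # Mirrors "no direct modification allowed at first".
        raise TypeError("read-only: in-place modification is not allowed")


def write_vector(path, values):
    """Write floats to a raw binary file in the layout expected above."""
    with open(path, "wb") as f:
        for v in values:
            f.write(struct.pack("<d", v))
```

Because every slice is a copy, nothing downstream can alias or overwrite the file contents, which is the property that makes view_map and destroy_map unnecessary in a first version.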

# clone OD repo
git clone git@github.com:nouiz/pylearn.git Pylearn.nouiz
# create a local branch that tracks the remote branch
git checkout -b variants origin/pytables