annotate doc/v2_planning/arch_FB.txt @ 1291:ea923a06dea6

added my architecture proposal.
author Frederic Bastien <nouiz@nouiz.org>
date Thu, 30 Sep 2010 10:16:35 -0400
parents
children abc7a7e22ead
Current state and extensions of our framework
=============================================

Assumptions I make:

* The Dataset, Learner and Layers committees have done their work.

  * That means we have an easier way to build a learning model.

* Checkpointing is solved in one of these ways: we ignore it (short jobs), we don't care, manual checkpoints, structured checkpoints with an example, or we use the OB system.

Example MLP
-----------

* Select the hyper parameter search space with `jobman sqlschedules`.
* Dispatch the jobs with dbidispatch.
* *Manually* (fixable) reset the job status to START.

  * I started on this, but I will change the syntax to make it more generic.

* *Manually* relaunch crashed jobs.
* *Manually* (fixable) analyse/visualise the results. (We need to start those meetings at some point.)
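
As an illustration of the first step, here is a minimal Python sketch of what selecting a search space amounts to, assuming the space is a plain grid. The function `expand_grid` and the hyper parameter names are hypothetical and are not part of jobman; the sketch only shows how a grid expands into one job per combination.

```python
import itertools


def expand_grid(space):
    """Return one dict per point of the Cartesian product of the value lists.

    `space` maps a hyper parameter name to the list of values to try.
    This helper is hypothetical; it is not a jobman function.
    """
    names = sorted(space)
    value_lists = [space[name] for name in names]
    return [dict(zip(names, values))
            for values in itertools.product(*value_lists)]


# Illustrative MLP search space (names and values are assumptions).
space = {"n_hidden": [100, 500], "learning_rate": [0.1, 0.01, 0.001]}
grid_jobs = expand_grid(space)
print(len(grid_jobs))  # 6 jobs: 2 values of n_hidden times 3 learning rates
```

Each resulting dict would correspond to one scheduled job.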

Example MLP+cross validation
----------------------------

* Modify the dataset interface to accept 2 new hyper parameters: nb_cross_fold=X and id_cross_fold.
* Schedule all of the folds to run in parallel with `jobman sqlschedules`.
* *Manually* (fixable) reset the job status to START.
* *Manually* relaunch crashed jobs.
* *Manually* (fixable) analyse/visualise the results.

  * Those tools need to understand the concept of cross validation.

* *Manually* (fixable with the proposal below) launch a retraining on the full dataset with the best hyper parameters.
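
The two dataset hyper parameters above could be interpreted as in the following Python sketch. Only the names nb_cross_fold and id_cross_fold come from this proposal; the function `cross_fold_indices` and the choice of assigning examples to folds by index modulo are assumptions made for illustration.

```python
def cross_fold_indices(n_examples, nb_cross_fold, id_cross_fold):
    """Split range(n_examples) into (train, valid) index lists for one fold.

    Example i goes to the validation set of fold (i % nb_cross_fold).
    This assignment rule is an assumption, not the dataset committee's API.
    """
    if not 0 <= id_cross_fold < nb_cross_fold:
        raise ValueError("id_cross_fold must be in [0, nb_cross_fold)")
    train, valid = [], []
    for i in range(n_examples):
        if i % nb_cross_fold == id_cross_fold:
            valid.append(i)
        else:
            train.append(i)
    return train, valid


train, valid = cross_fold_indices(10, nb_cross_fold=5, id_cross_fold=2)
print(valid)  # [2, 7]
```

With this shape, each fold is an independent job that differs only in id_cross_fold, so all folds can be scheduled in parallel.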


Example DBN
-----------

* *Concept* JOB PHASE. DBN (unsupervised and supervised).

  * We suppose the job script has a parameter that tells it which phase to run.

* *Jobman Extension* We can extend jobman to handle dependencies between jobs.

  * Proposed syntax:

    .. code-block::

        jobman sqlschedule p0={{}} ... -- p1={{}} ... -- p2=...

    * The parameters before the first `--` tell which jobs the new jobs depend on. (This allows depending on many jobs at the same time.)
    * The parameters between the two `--` tell that we want to create a new group of jobs for all those jobs.
    * The parameters after the second `--` tell which new jobs to create for each new group of jobs.

* *Jobman Extension* Create `jobman dispatch`.

  * This will dispatch new jobs to run on the cluster with dbidispatch once a job's dependencies have finished.

* *Jobman Extension* Create `jobman monitor`.

  * This repeatedly calls `jobman condor_check` and prints on the screen the jobs that may have crashed. It needs to filter the output of condor_check.
  * We can create other `jobman CLUSTER_check` commands for mammouth, colosse, angel, ...

* *Jobman Extension* When we change the status of a job to START in jobman, change the status of the jobs that depend on it at the same time.
* *Jobman Extension* Determine whether a job finished correctly or not.

  * If a job did not finish correctly, don't start the jobs that follow it.

* *Jobman Policy* All changes to the db should be doable with jobman commands.

* *Manually* relaunch crashed jobs.
* *Manually* (fixable) analyse/visualise the results.

  * Those tools need to understand the concept of job phase or be agnostic to it.
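
The dependency rules proposed above (dispatch a job only when its dependencies finished correctly, and reset dependent jobs together with the job they depend on) can be sketched in Python as follows. This is only an illustration under assumptions: the `Job` class, the helper functions and every status name other than START are hypothetical and do not describe jobman's actual implementation. The dependency graph is assumed to have no cycles.

```python
# Hypothetical status labels; only START is named in this proposal.
START, RUNNING, DONE, FAILED = "START", "RUNNING", "DONE", "FAILED"


class Job:
    def __init__(self, name, depends_on=()):
        self.name = name
        self.depends_on = list(depends_on)  # jobs that must finish first
        self.status = START


def ready_jobs(jobs):
    """Jobs still in START whose dependencies all finished correctly."""
    return [job for job in jobs
            if job.status == START
            and all(dep.status == DONE for dep in job.depends_on)]


def reset_to_start(job, jobs):
    """Reset a job and, recursively, every job that depends on it."""
    job.status = START
    for other in jobs:
        if job in other.depends_on:
            reset_to_start(other, jobs)


# Two phases of a DBN: the supervised phase depends on the unsupervised one.
pretrain = Job("unsupervised")
finetune = Job("supervised", depends_on=[pretrain])
jobs = [pretrain, finetune]
print([j.name for j in ready_jobs(jobs)])  # ['unsupervised']
pretrain.status = DONE
print([j.name for j in ready_jobs(jobs)])  # ['supervised']
```

A job whose dependency ended in FAILED never appears in `ready_jobs`, which matches the rule that the following jobs are not started.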


* *Cross validation retrain* can be done with an additional phase in the extensions.

  * The new job needs to know how to determine the best hyper parameters from the results.
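
One possible way for the retrain job to pick the best hyper parameters is sketched below in Python. The selection criterion (lowest mean validation error across folds), the result format and the function name `best_hyper_parameters` are all assumptions; this proposal does not specify them.

```python
from collections import defaultdict


def best_hyper_parameters(results):
    """Pick the hyper parameters with the lowest mean validation error.

    `results` is a list of (hyper_params_dict, id_cross_fold, valid_error)
    tuples, one per finished fold job. This format is hypothetical.
    """
    errors = defaultdict(list)
    for params, _fold, valid_error in results:
        # Dicts are not hashable, so use a sorted tuple of items as the key.
        errors[tuple(sorted(params.items()))].append(valid_error)
    best = min(errors, key=lambda key: sum(errors[key]) / len(errors[key]))
    return dict(best)


fold_results = [
    ({"learning_rate": 0.1}, 0, 0.20),
    ({"learning_rate": 0.1}, 1, 0.30),
    ({"learning_rate": 0.01}, 0, 0.22),
    ({"learning_rate": 0.01}, 1, 0.24),
]
print(best_hyper_parameters(fold_results))  # {'learning_rate': 0.01}
```

The returned dict would then parameterize the job that retrains on the full dataset.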


* This can be extended to double cross validation.

  * The dataset must support double cross validation.
  * We create more phases in jobman.

Hyper parameter search in Pylearn
---------------------------------

We would want the hyper parameter search to be done in pylearn in some cases. This will add a dependency on jobman. We can finish/verify how jobman works with sqlite so that an installed db is not required. sqlite is included in python 2.5, and jobman requires python 2.5. We could make jobman's dependency on sqlalchemy optional when we use sqlite, to limit the number of dependencies.
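
To illustrate why sqlite removes the need for an installed db, here is a minimal Python sketch that keeps a job table using only the standard library `sqlite3` module. The table layout and the helper names are hypothetical and do not reflect jobman's actual schema.

```python
import sqlite3


def make_job_table(path=":memory:"):
    """Open a sqlite database and create a small, hypothetical job table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE jobs ("
        "id INTEGER PRIMARY KEY, status TEXT, learning_rate REAL)"
    )
    return conn


def jobs_with_status(conn, status):
    """Return (id, learning_rate) rows for the jobs in the given status."""
    cursor = conn.execute(
        "SELECT id, learning_rate FROM jobs WHERE status = ? ORDER BY id",
        (status,),
    )
    return cursor.fetchall()


conn = make_job_table()
conn.executemany(
    "INSERT INTO jobs (status, learning_rate) VALUES (?, ?)",
    [("START", 0.1), ("START", 0.01)],
)
conn.execute("UPDATE jobs SET status = 'DONE' WHERE id = 1")
waiting = jobs_with_status(conn, "START")
print(waiting)  # [(2, 0.01)]
conn.close()
```

No server process and no third-party package is involved; passing a file path instead of `:memory:` would make the table persistent.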