comparison doc/v2_planning/requirements.txt @ 1187:7d34edde029d

added serializability requirement
author James Bergstra <bergstrj@iro.umontreal.ca>
date Fri, 17 Sep 2010 17:53:35 -0400
parents 1f5465622394
children ab80ba052d32
comparison
1186:f111f8c2a280 1187:7d34edde029d
84 the graph is modified to take advantage of the fact that k-fold validation can
85 be performed efficiently internally by some specific algorithm. Then it may
86 not be obvious anymore how to remove the k-fold split in the saved model you
87 want to use in production.
88
89
90 Requirements for component architecture
91 =======================================
92
93
94 R14. Serializability of experiments. (essentially in pursuit of R6)
95
96 Jobs that are running a learning algorithm with our components (datasets,
97 models, algorithms) must be able to serialize the experiment's state to a string
98 (typically written to disk) and be able to restart it from such a string. There
99 must be a mechanism to tell a job to serialize the experiment as soon as
100 possible, and a latency of up to 10 seconds should be acceptable. It must also
101 be possible to deserialize the experiment for introspection (inspect the state
102 of individual components), not just for continuing the experiment. The
103 experiment can assume that resources on disk that were present when the
104 experiment started will be present when the experiment resumes. The experiment
105 cannot assume that resources written by the experiment will still be there (e.g.
106 in /tmp or cwd). Implementations should keep the serialized representation
107 compact by omitting state that can be recomputed or reloaded from disk
108 at deserialization time.
109
110 This requirement is aimed at enabling process migration and job control as well
111 as post-hoc analysis of experiment results.
112
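
One way to read R14 is sketched below in Python. Everything named here is an
assumption made for illustration and does not come from the requirements: the
Experiment class, its request_save / to_string / from_string methods, the JSON
encoding, and the toy "weights" update that stands in for a real learning
algorithm. The sketch only shows the shape of the contract: state goes to a
string on request, the run can resume from that string, the string can be
inspected without resuming, and recomputable data is left out to keep it compact.

```python
import json


class Experiment:
    """Hypothetical experiment; a toy update stands in for real training."""

    def __init__(self, dataset_path, n_steps):
        # Path to a resource that existed before the experiment started,
        # so R14 lets us assume it is still there on resume.
        self.dataset_path = dataset_path
        self.n_steps = n_steps
        self.step = 0
        self.weights = [0.0, 0.0]
        self.save_requested = False
        # Recomputable data (e.g. the loaded dataset); never serialized.
        self._cache = None

    def request_save(self):
        """Ask the job to serialize at the next safe point."""
        self.save_requested = True

    def to_string(self):
        """Serialize only the state that cannot be recomputed or reloaded."""
        return json.dumps({
            "dataset_path": self.dataset_path,
            "n_steps": self.n_steps,
            "step": self.step,
            "weights": self.weights,
        })

    @classmethod
    def from_string(cls, text):
        """Rebuild an experiment for resuming or for introspection."""
        state = json.loads(text)
        experiment = cls(state["dataset_path"], state["n_steps"])
        experiment.step = state["step"]
        experiment.weights = state["weights"]
        return experiment

    def run(self):
        """Run until done (returns None) or until a save is requested
        (returns the serialized state)."""
        while self.step < self.n_steps:
            self.weights = [w + 0.5 for w in self.weights]
            self.step += 1
            # Checking once per step bounds the save latency by the
            # duration of a single step.
            if self.save_requested:
                return self.to_string()
        return None


# Usage: interrupt after one step, inspect the saved state, then resume.
experiment = Experiment("data/train.npy", n_steps=5)
experiment.request_save()
saved = experiment.run()
resumed = Experiment.from_string(saved)
resumed.run()
```

In a real job the save request would more likely arrive from outside the
process (a signal or a sentinel file, for instance), and the step granularity
would have to be chosen so that the 10-second latency bound holds.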