changeset 1187:7d34edde029d

added serializability requirement
author James Bergstra <bergstrj@iro.umontreal.ca>
date Fri, 17 Sep 2010 17:53:35 -0400
parents f111f8c2a280
children 073c2fab7bcd
files doc/v2_planning/requirements.txt
diffstat 1 files changed, 24 insertions(+), 0 deletions(-)
--- a/doc/v2_planning/requirements.txt	Fri Sep 17 17:07:52 2010 -0400
+++ b/doc/v2_planning/requirements.txt	Fri Sep 17 17:53:35 2010 -0400
@@ -86,3 +86,27 @@
 not be obvious anymore how to remove the k-fold split in the saved model you
 want to use in production.
 
+
+Requirements for component architecture
+=======================================
+
+
+R14.  Serializability of experiments. (essentially in pursuit of R6)
+
+Jobs that are running a learning algorithm with our components (datasets,
+models, algorithms) must be able to serialize the experiment's state to a string
+(typically written to disk) and be able to restart it from such a string.  There
+must be a mechanism to tell a job to serialize the experiment as soon as
+possible, and a latency of up to 10 seconds should be acceptable.  It must also
+be possible to deserialize the experiment for introspection (inspect the state
+of individual components), not just for continuing the experiment.  The
+experiment can assume that resources on disk that were present when the
+experiment started will be present when the experiment resumes.  The experiment
+cannot assume that resources written by the experiment will still be there (e.g.
+in /tmp or cwd).  Implementations should keep the serialized representation
+compact by omitting anything that can be recomputed or reloaded from disk at
+deserialization time.
+
+This requirement is aimed at enabling process migration and job control as well
+as post-hoc analysis of experiment results.
+
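
Reviewer's illustration (not part of the changeset): one way R14 could be met is sketched below, as a minimal example under stated assumptions rather than the project's actual design. The class `Experiment`, its attributes, and the `checkpoint_requested` flag are all hypothetical names invented for this sketch. It assumes job control sets the flag (for instance from a signal handler), that the "string" is a Python bytes object produced by `pickle`, and that the cache stands in for anything that can be recomputed or reloaded from disk.

```python
import pickle


class Experiment:
    """Hypothetical experiment: a step counter, parameters, and a rebuildable cache."""

    def __init__(self, n_steps):
        self.n_steps = n_steps
        self.step = 0
        self.params = [0.0, 0.0]
        # Assumption: job control sets this flag, e.g. from a signal handler.
        self.checkpoint_requested = False
        self._cache = self._build_cache()

    @staticmethod
    def _build_cache():
        # Stand-in for data that can be recomputed or reloaded from disk.
        return [i * i for i in range(1000)]

    def dumps(self):
        # Keep the serialized form compact: the cache is deliberately left out.
        state = {"n_steps": self.n_steps, "step": self.step, "params": self.params}
        return pickle.dumps(state)

    @classmethod
    def loads(cls, blob):
        # Usable both for introspection and for resuming the run.
        state = pickle.loads(blob)
        exp = cls(state["n_steps"])  # the constructor rebuilds the cache
        exp.step = state["step"]
        exp.params = state["params"]
        return exp

    def run(self):
        """Run to completion, or return a serialized state if a checkpoint is requested."""
        while self.step < self.n_steps:
            self.params = [p + 0.5 for p in self.params]  # one unit of "training"
            self.step += 1
            # Checking once per step bounds the serialization latency by the step time.
            if self.checkpoint_requested:
                self.checkpoint_requested = False
                return self.dumps()
        return None
```

In this sketch, `Experiment.loads(blob)` gives back an object whose `step` and `params` can be inspected directly, and calling `run()` on it continues from the saved step. Only plain built-in types are pickled, so the serialized state does not depend on the class being importable at load time.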