# HG changeset patch
# User Frederic Bastien
# Date 1284771318 14400
# Node ID 0e12ea6ba6612903bb646baf2eb67f3df93afe7a
# Parent 073c2fab7bcd5676f74df1420d334bbc992ae53a
fix many rst syntax error warning.

diff -r 073c2fab7bcd -r 0e12ea6ba661 doc/v2_planning/existing_python_ml_libraries.txt
--- a/doc/v2_planning/existing_python_ml_libraries.txt Fri Sep 17 20:24:30 2010 -0400
+++ b/doc/v2_planning/existing_python_ml_libraries.txt Fri Sep 17 20:55:18 2010 -0400
@@ -6,7 +6,7 @@

 * How much should we try to interface with other libraries?
 * What parts can we and should we implement ourselves and what should we leave
-to the other libraries?
+  to the other libraries?

 Preliminary list of libraries to look at:

@@ -22,5 +22,4 @@
 * scikits.learn   Guillaume (but could trade)

 Also check out http://scipy.org/Topical_Software#head-fc5493250d285f5c634e51be7ba0f80d5f4d6443
-- scipy.org's ``topical software'' section on Artificial Intelligence and
-  Machine Learning
+- scipy.org's ``topical software'' section on Artificial Intelligence and Machine Learning

diff -r 073c2fab7bcd -r 0e12ea6ba661 doc/v2_planning/learner.txt
--- a/doc/v2_planning/learner.txt Fri Sep 17 20:24:30 2010 -0400
+++ b/doc/v2_planning/learner.txt Fri Sep 17 20:55:18 2010 -0400
@@ -9,33 +9,34 @@
 following semantics:

 * A learner has named hyper-parameters that control how it learns (these can be viewed
-as options of the constructor, or might be set directly by a user)
+  as options of the constructor, or might be set directly by a user)

 * A learner also has an internal state that depends on what it has learned.

 * A learner reads and produces data, so the definition of learner is
-intimately linked to the definition of dataset (and task).
+  intimately linked to the definition of dataset (and task).

 * A learner has one or more 'train' or 'adapt' functions by which
-it is given a sample of data (typically either the whole training set, or
-a mini-batch, which contains as a special case a single 'example'). Learners
-interface with datasets in order to obtain data. These functions cause the
-learner to change its internal state and take advantage to some extent
-of the data provided. The 'train' function should take charge of
-completely exploiting the dataset, as specified per the hyper-parameters,
-so that it would typically be called only once. An 'adapt' function
-is meant for learners that can operate in an 'online' setting where
-data continually arrive and the control loop (when to stop) is to
-be managed outside of it. For most intents and purposes, the
-'train' function could also handle the 'online' case by providing
-the controlled iterations over the dataset (which would then be
-seen as a stream of examples).
+  it is given a sample of data (typically either the whole training set, or
+  a mini-batch, which contains as a special case a single 'example'). Learners
+  interface with datasets in order to obtain data. These functions cause the
+  learner to change its internal state and take advantage to some extent
+  of the data provided. The 'train' function should take charge of
+  completely exploiting the dataset, as specified per the hyper-parameters,
+  so that it would typically be called only once. An 'adapt' function
+  is meant for learners that can operate in an 'online' setting where
+  data continually arrive and the control loop (when to stop) is to
+  be managed outside of it. For most intents and purposes, the
+  'train' function could also handle the 'online' case by providing
+  the controlled iterations over the dataset (which would then be
+  seen as a stream of examples).
+
   * learner.train(dataset)
   * learner.adapt(data)

 * Different types of learners can then exploit their internal state
-in order to perform various computations after training is completed,
-or in the middle of training, e.g.,
+  in order to perform various computations after training is completed,
+  or in the middle of training, e.g.,

   * y=learner.predict(x)
     for learners that see (x,y) pairs during training and predict y given x,
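For illustration, here is a minimal runnable sketch of the train/adapt/predict semantics described in the hunk above. The MeanLearner class and its toy running-mean update are assumptions made for the example; they are not part of this changeset or of the proposed API.

.. code-block:: python

    # Toy illustration of the semantics above: a hyper-parameter set at
    # construction time, an internal state, 'train' for whole-dataset use,
    # 'adapt' for online use, and 'predict' to exploit the learned state.
    class MeanLearner(object):
        def __init__(self, learning_rate=0.1):   # hyper-parameter
            self.learning_rate = learning_rate
            self.mean = 0.0                       # internal (learned) state

        def train(self, dataset):
            """Exploit the whole dataset at once, as the hyper-parameters dictate."""
            for example in dataset:
                self.adapt(example)

        def adapt(self, data):
            """Online update from a single (x, y) example."""
            x, y = data
            self.mean += self.learning_rate * (y - self.mean)

        def predict(self, x):
            return self.mean

    learner = MeanLearner()
    learner.train([(0, 1.0), (1, 3.0), (2, 2.0)])   # batch-style use
    print(learner.predict(5))                       # predict y for a new x

Written this way, 'train' is just a controlled loop over 'adapt', which is exactly the relationship the text describes between the batch and online cases.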
@@ -67,15 +68,18 @@
   * [prediction,costs] = learner.predict_and_adapt((x,y))

 * Some learners could include in their internal state not only what they
-have learned but some information about recently seen examples that conditions
-the expected distribution of upcoming examples. In that case, they might
-be used, e.g. in an online setting as follows:
+  have learned but some information about recently seen examples that conditions
+  the expected distribution of upcoming examples. In that case, they might
+  be used, e.g. in an online setting as follows:
+
+.. code-block:: python
+
    for (x,y) in data_stream:
       [prediction,costs]=learner.predict((x,y))
       accumulate_statistics(prediction,costs)

 * In some cases, each example is itself a (possibly variable-size) sequence
-or other variable-size object (e.g. an image, or a video)
+  or other variable-size object (e.g. an image, or a video)


@@ -187,6 +191,8 @@
 An object that allows us to explore the graph discussed above. Specifically, it represents
 an explored node in that graph.

+.. code-block:: python
+
     def active_instructions()
         """ Return a list/set of Instruction instances (see below) that the Learner is prepared
         to handle.
@@ -207,6 +213,8 @@
 An object that represents a potential edge in the graph discussed above. It is an
 operation that a learner can perform.

+.. code-block:: python
+
     arg_types
         """a list of Type object (see below) indicating what args are required by execute"""

@@ -228,6 +236,8 @@
 It is not necessary that a Type specifies exactly which arguments are legal, but it should
 `include` all legal arguments, and exclude as many illegal ones as possible.

+.. code-block:: python
+
     def includes(value):
         """return True if value is a legal argument"""

@@ -318,17 +328,23 @@
 I'm wondering what's the benefit of such an API compared to simply defining a
 new method for each instruction. It seems to me that typically, the 'execute'
 method would end up being something like
+
+.. code-block:: python
+
     if instruction == 'do_x':
         self.do_x(..)
     elif instruction == 'do_y':
         self.do_y(..)
     ...
+
 so why not directly call do_x / do_y instead?

 JB replies: I agree with you, and in the implementation of a Learner I suggest
 using Python decorators to get the best of both worlds:

+.. code-block:: python
+
 class NNet(Learner):

     ...
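To make the decorator idea concrete, here is one possible sketch. The patch elides the actual NNet example, so the instruction decorator and the execute / active_instructions helpers below are illustrative assumptions, not the proposed implementation.

.. code-block:: python

    def instruction(fn):
        """Decorator marking a method as an instruction the generic API can call."""
        fn.is_instruction = True
        return fn

    class Learner(object):
        def active_instructions(self):
            """Collect every method tagged with @instruction."""
            return [name for name in dir(self)
                    if getattr(getattr(self, name), 'is_instruction', False)]

        def execute(self, name, *args, **kwargs):
            """Generic dispatch; equivalent to the if/elif chain quoted above."""
            if name not in self.active_instructions():
                raise ValueError('unknown instruction: %s' % name)
            return getattr(self, name)(*args, **kwargs)

    class NNet(Learner):
        @instruction
        def do_x(self, x):
            return x          # placeholder body

        @instruction
        def do_y(self, y):
            return y          # placeholder body

    net = NNet()
    net.do_x(1)               # direct method call
    net.execute('do_x', 1)    # same call through the generic instruction API

Registering methods this way keeps the generic instruction API and the plain method calls in sync without hand-maintaining an if/elif dispatch chain.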
@@ -429,15 +445,16 @@
 I think having a DAG is useful in many ways (all this are things that one
 might think about implementing in a far future, I'm not proposing to implement
 them unless we want to use them - like the reconstruction ):
+
 * there exist the posibility of writing optimizations ( theano style )
 * there exist the posibility to add global view utility functions ( like
-  a reconstruction function for SdA - extremely low level here), or global
-  view diagnostic tools
+  a reconstruction function for SdA - extremely low level here), or global
+  view diagnostic tools
 * the posibility of creating a GUI ( where you just create the Graph by
-  picking transforms and variables from a list ) or working interactively
-  and then generating code that will reproduce the graph
+  picking transforms and variables from a list ) or working interactively
+  and then generating code that will reproduce the graph
 * you can view the graph and different granularity levels to understand
-  things ( global diagnostics)
+  things ( global diagnostics)

 We should have a taxonomy of possible classes of functions and possible
 classes of variables, but those should not be exclusive. We can work at a high
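As a purely illustrative aside, a toy version of such a transform/variable graph might look like the sketch below; none of these classes exist in the codebase, and all names are invented.

.. code-block:: python

    # Toy DAG of named transforms over named variables, to make the
    # "global view" idea concrete (optimization passes, reconstruction
    # utilities, or a GUI could all be built on queries like producers_of).
    class Variable(object):
        def __init__(self, name):
            self.name = name

    class Transform(object):
        def __init__(self, name, inputs, outputs):
            self.name = name          # e.g. 'encoder', 'decoder'
            self.inputs = inputs      # list of Variable
            self.outputs = outputs    # list of Variable

    class Graph(object):
        def __init__(self, transforms):
            self.transforms = transforms

        def producers_of(self, variable):
            """Which transforms produce this variable?"""
            return [t for t in self.transforms if variable in t.outputs]

    x, h, y = Variable('x'), Variable('h'), Variable('y')
    g = Graph([Transform('encoder', [x], [h]), Transform('decoder', [h], [y])])
    print([t.name for t in g.producers_of(h)])   # ['encoder']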
diff -r 073c2fab7bcd -r 0e12ea6ba661 doc/v2_planning/main_plan.txt
--- a/doc/v2_planning/main_plan.txt Fri Sep 17 20:24:30 2010 -0400
+++ b/doc/v2_planning/main_plan.txt Fri Sep 17 20:55:18 2010 -0400
@@ -3,7 +3,7 @@
 ==========

 Yoshua (points discussed Thursday Sept 2, 2010 at LISA tea-talk)
-------
+----------------------------------------------------------------

 ****** Why we need to get better organized in our code-writing ******

@@ -151,6 +151,7 @@
 Another thing to consider related to datasets is that there are a number of
 other efforts to have standard ML datasets, and we should be aware of them,
 and compatible with them when it's easy:
+
 - mldata.org   (they have a file format, not sure how many use it)
 - weka         (ARFF file format)
 - scikits.learn
@@ -168,10 +169,10 @@
 Yoshua (about ideas proposed by Pascal Vincent a while ago):

   - we may want to distinguish between datasets and tasks: a task defines
-  not just the data but also things like what is the input and what is the
-  target (for supervised learning), and *importantly* a set of performance metrics
-  that make sense for this task (e.g. those used by papers solving a particular
-  task, or reported for a particular benchmark)
+    not just the data but also things like what is the input and what is the
+    target (for supervised learning), and *importantly* a set of performance metrics
+    that make sense for this task (e.g. those used by papers solving a particular
+    task, or reported for a particular benchmark)
   - we should discuss about a few "standards" that datasets and tasks may comply to, such as
     - "input" and "target" fields inside each example, for supervised or semi-supervised learning tasks

diff -r 073c2fab7bcd -r 0e12ea6ba661 doc/v2_planning/neural_net.txt
--- a/doc/v2_planning/neural_net.txt Fri Sep 17 20:24:30 2010 -0400
+++ b/doc/v2_planning/neural_net.txt Fri Sep 17 20:55:18 2010 -0400
@@ -11,7 +11,7 @@

 Objective ( Razvan)
----------
+-------------------

 Come up with a description of how to write learners ( how to combine
 optimizer, structure, error measure, how to talk to datasets, tasks ( if there

diff -r 073c2fab7bcd -r 0e12ea6ba661 doc/v2_planning/optimization.txt
--- a/doc/v2_planning/optimization.txt Fri Sep 17 20:24:30 2010 -0400
+++ b/doc/v2_planning/optimization.txt Fri Sep 17 20:55:18 2010 -0400
@@ -64,6 +64,7 @@
 matter if we are just wrapping a theano-based algorithm (that already has to
 handle multiple parameters), and avoiding useless data copies on each call to
 f / df can only help speed-wise.
+
 JB replies: Done, I added possibility that x0 is list of ndarrays to the api
 doc.

@@ -86,6 +87,8 @@
 OD: I wish we could get closer to each other the Theano and Numpy interfaces.
 It would be nice if we could do something like:

+.. code-block:: python
+
     # Theano version.
     updates = sgd([p], gradients=[g], stop=stop, step_size=.1)
     sgd_step = theano.function([input_var, target_var], [], updates=updates)
@@ -101,6 +104,8 @@

 where sgd would look something like:

+.. code-block:: python
+
     class sgd(...):
         def __init__(self, parameters, cost=None, gradients=None, stop=None,
                      step_size=None):
@@ -117,6 +122,8 @@

 Then a wrapper to provide a scipy-like interface could be:

+.. code-block:: python
+
     def minimize(x0, f, df, algo, **kw):
         stop = numpy.array(0, dtype=numpy.int8)
         algo_step = eval(algo)([x0], cost=f, gradients=lambda x: (df(x), ),
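For reference, here is a self-contained numpy-only sketch of the scipy-like wrapper idea discussed in the optimization.txt hunks above. The sgd and minimize names mirror the proposal, but the bodies (including the extra n_steps argument and the dispatch dict used instead of eval) are assumptions for illustration, not the actual API.

.. code-block:: python

    import numpy

    class sgd(object):
        def __init__(self, parameters, cost=None, gradients=None, stop=None,
                     step_size=None):
            self.parameters = parameters
            self.gradients = gradients
            self.step_size = step_size

        def step(self):
            # One plain gradient step on each parameter array, in place.
            for p, g in zip(self.parameters, self.gradients(*self.parameters)):
                p -= self.step_size * g

    def minimize(x0, f, df, algo, n_steps=100, **kw):
        x = x0.copy()
        stepper = {'sgd': sgd}[algo]([x], cost=f,
                                     gradients=lambda x: (df(x),), **kw)
        for _ in range(n_steps):
            stepper.step()
        return x

    # Minimize f(x) = ||x - 1||^2; the minimum is the all-ones vector.
    x_min = minimize(numpy.zeros(3), f=lambda x: ((x - 1) ** 2).sum(),
                     df=lambda x: 2 * (x - 1), algo='sgd', step_size=0.1)
    print(x_min)

Mapping the algorithm name through a small dict rather than eval keeps the same string-based selection while avoiding evaluation of arbitrary code.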
diff -r 073c2fab7bcd -r 0e12ea6ba661 doc/v2_planning/sampler.txt
--- a/doc/v2_planning/sampler.txt Fri Sep 17 20:24:30 2010 -0400
+++ b/doc/v2_planning/sampler.txt Fri Sep 17 20:55:18 2010 -0400
@@ -44,6 +44,6 @@
 =================

 * MCMC methods have a usage pattern that is quite different from the kind of univariate sampling methods
-needed for nice-and-easy parametric families.
+  needed for nice-and-easy parametric families.


diff -r 073c2fab7bcd -r 0e12ea6ba661 doc/v2_planning/use_cases.txt
--- a/doc/v2_planning/use_cases.txt Fri Sep 17 20:24:30 2010 -0400
+++ b/doc/v2_planning/use_cases.txt Fri Sep 17 20:55:18 2010 -0400
@@ -56,8 +56,9 @@

 There are many ways that the training could be configured, but here is one:

+.. code-block:: python

-vm.call(
+   vm.call(
     halflife_stopper(
         # OD: is n_hidden supposed to be n_classes instead?
         initial_model=random_linear_classifier(MNIST.n_inputs, MNIST.n_hidden,
                                                r_seed=234432),
@@ -108,22 +109,25 @@
 regularly had issues in PLearn with the fact we had for instance to give the
 number of inputs when creating a neural network. I much prefer when this kind
 of thing can be figured out at runtime:
-  - Any parameter you can get rid of is a significant gain in
-    user-friendliness.
-  - It's not always easy to know in advance e.g. the dimension of your input
-    dataset. Imagine for instance this dataset is obtained in a first step
-    by going through a PCA whose number of output dimensions is set so as to
-    keep 90% of the variance.
-  - It seems to me it fits better the idea of a symbolic graph: my intuition
-    (that may be very different from what you actually have in mind) is to
-    see an experiment as a symbolic graph, which you instantiate when you
-    provide the input data. One advantage of this point of view is it makes
-    it natural to re-use the same block components on various datasets /
-    splits, something we often want to do.
+
+- Any parameter you can get rid of is a significant gain in
+  user-friendliness.
+- It's not always easy to know in advance e.g. the dimension of your input
+  dataset. Imagine for instance this dataset is obtained in a first step
+  by going through a PCA whose number of output dimensions is set so as to
+  keep 90% of the variance.
+- It seems to me it fits better the idea of a symbolic graph: my intuition
+  (that may be very different from what you actually have in mind) is to
+  see an experiment as a symbolic graph, which you instantiate when you
+  provide the input data. One advantage of this point of view is it makes
+  it natural to re-use the same block components on various datasets /
+  splits, something we often want to do.

 K-fold cross validation of a classifier
 ---------------------------------------

+.. code-block:: python
+
 splits = kfold_cross_validate(
     # OD: What would these parameters mean?
     indexlist = range(1000)
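The kfold_cross_validate call above is truncated and its parameters are explicitly an open question in the text, so the following is only a generic, hypothetical k-fold splitter offered to ground the discussion; the kfold_splits name and its return convention are assumptions, not the proposed interface.

.. code-block:: python

    def kfold_splits(indexlist, K):
        """Yield (train_indices, test_indices) pairs for K folds."""
        folds = [indexlist[i::K] for i in range(K)]
        for k in range(K):
            test = folds[k]
            train = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield train, test

    for train_idx, test_idx in kfold_splits(list(range(1000)), K=10):
        # train a classifier on train_idx, evaluate it on test_idx
        pass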