changeset 1064:a41cc29cee26

v2planning optimization - API draft
author James Bergstra <bergstrj@iro.umontreal.ca>
date Thu, 09 Sep 2010 17:44:43 -0400
parents 074901ccf7b6
children 2bbc464d6ed0
files doc/v2_planning/api_optimization.txt doc/v2_planning/optimization.txt
diffstat 2 files changed, 108 insertions(+), 4 deletions(-)
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/v2_planning/api_optimization.txt	Thu Sep 09 17:44:43 2010 -0400
@@ -0,0 +1,98 @@
+Optimization API
+================
+
+Members: Bergstra, Lamblin, Delalleau, Glorot, Breuleux, Bordes
+Leader: Bergstra
+
+
+Description
+-----------
+
+We provide an API for iterative optimization algorithms, such as:
+
+ - stochastic gradient descent (incl. momentum, annealing)
+ - delta bar delta
+ - conjugate methods
+ - L-BFGS
+ - "Hessian Free"
+ - SGD-QN
+ - TONGA
+
+The API includes an iterative interface based on Theano, and a one-shot
+interface similar to SciPy's and MATLAB's that is based on Python and NumPy
+and uses Theano only for the implementation.
+
+
+Iterative Interface
+-------------------
+
+def iterative_optimizer(parameters, 
+        cost=None,
+        grads=None,
+        stop=None, 
+        updates=None,
+        **kwargs):
+    """
+    :param parameters: list or tuple of Theano variables (typically shared vars)
+        that we want to optimize iteratively.  If we're minimizing f(x), then
+        together, these variables represent 'x'.
+
+    :param cost: scalar-valued Theano variable that computes an exact or noisy
+        estimate of the cost (what are the conditions on the noise?).  Some
+        algorithms might need an exact cost, while others might ignore the
+        cost entirely if the grads are given.
+
+    :param grads: list or tuple of Theano variables representing the gradients of
+        the cost with respect to the corresponding parameters.  These default to
+        tensor.grad(cost, parameters).
+
+    :param stop: a shared variable (scalar integer) that (if provided) will be
+        updated to indicate whether the iterative minimization algorithm has
+        finished (1) or requires more iterations (0).
+
+    :param updates: a dictionary to update with the (var, new_value) items
+        associated with the iterative algorithm.  The default is a new empty
+        dictionary.  A KeyError is raised in case of key collisions.
+
+    :param kwargs: algorithm-dependent arguments
+
+    :returns: a dictionary mapping each parameter to an expression giving the
+       value it should take in order to carry out the optimization procedure.
+
+       If all the parameters are shared variables, then this dictionary may be
+       passed as the ``updates`` argument to theano.function.
+
+       There may be additional (variable, expression) pairs in the dictionary
+       corresponding to internal variables that are part of the optimization
+       algorithm.
+
+    """
+
+
+One-shot Interface
+------------------
+
+def minimize(x0, f, df, opt_algo, **kwargs):
+    """
+    Return a point x_new that minimizes function `f` with derivative `df`.
+
+    This is intended to provide an interface similar to SciPy's minimization
+    routines, or MATLAB's.
+
+    :type x0: numpy ndarray
+    :param x0: starting point for minimization
+
+    :type f: Python callable mapping an array like x0 to a scalar
+    :param f: function to minimize
+
+    :type df: Python callable mapping an array like x0 to the derivative of
+        `f` at that point
+    :param df: derivative of `f`
+
+    :param opt_algo: one of the functions that implement the
+        `iterative_optimizer` interface.
+
+    :param kwargs: passed through to `opt_algo`
+
+    """
+
+
+
--- a/doc/v2_planning/optimization.txt	Thu Sep 09 15:57:48 2010 -0400
+++ b/doc/v2_planning/optimization.txt	Thu Sep 09 17:44:43 2010 -0400
@@ -1,9 +1,15 @@
-Discussion of Optimization-Related Issues
+=========================
+Optimization for Learning
+=========================
+
+Members: Bergstra, Lamblin, Delalleau, Glorot, Breuleux, Bordes
+Leader: Bergstra
+
+
+
+Initial Writeup by James
 =========================================
 
-Members: JB, PL, OD, XG
-
-Representative: JB
 
 
 Previous work - scikits, openopt, scipy  provide function optimization