view doc/v2_planning/api_optimization.txt @ 1074:ee7f34fc98fe

Merged
author Olivier Delalleau <delallea@iro>
date Fri, 10 Sep 2010 11:38:40 -0400
parents 16ea3e5c5a7a
children 153cf820a975

Optimization API
================

Members: Bergstra, Lamblin, Delalleau, Glorot, Breuleux, Bordes
Leader: Bergstra


Description
-----------

This API is for iterative optimization algorithms, such as:

 - stochastic gradient descent (incl. momentum, annealing)
 - delta bar delta
 - conjugate gradient methods
 - L-BFGS
 - "Hessian Free"
 - SGD-QN
 - TONGA
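
As an illustration of what "iterative" means for the first item in the list,
here is a minimal pure-Python sketch of gradient descent with momentum and
step-size annealing on a one-dimensional problem. The function name, its
arguments and the 1 / (1 + anneal * t) schedule are assumptions made for this
example only; they are not part of the proposed API.

```python
def sgd_momentum(grad, x0, step_size=0.1, momentum=0.9, anneal=0.0, n_iter=500):
    """Minimize a function of one variable, given a callable for its gradient.

    All argument names are illustrative, not part of the proposed API.
    """
    x = float(x0)
    velocity = 0.0
    for t in range(n_iter):
        # Annealing: shrink the step size as 1 / (1 + anneal * t).
        lr = step_size / (1.0 + anneal * t)
        # Momentum: blend the previous step with the new gradient step.
        velocity = momentum * velocity - lr * grad(x)
        x = x + velocity
    return x


# f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
x_min = sgd_momentum(lambda x: 2.0 * (x - 3.0), x0=0.0)
```

Each pass through the loop is one iteration in the sense used here: it reads
the current point and some internal state (the velocity) and produces new
values for both.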

The API includes an iterative interface based on Theano and a one-shot
interface, similar to those of SciPy and MATLAB, that is based on Python and
NumPy and uses Theano only in its implementation.


Iterative Interface
-------------------

def iterative_optimizer(parameters, 
        cost=None,
        grads=None,
        stop=None, 
        updates=None,
        **kwargs):
    """
    :param parameters: list or tuple of Theano variables (typically shared vars)
        that we want to optimize iteratively.  If we're minimizing f(x), then
        together, these variables represent 'x'.

    :param cost: scalar-valued Theano variable that computes an exact or noisy
        estimate of the cost (what are the conditions on the noise?).  Some
        algorithms might need an exact cost; others might ignore the cost if
        the grads are given.

    :param grads: list or tuple of Theano variables representing the gradients on
        the corresponding parameters.  These default to tensor.grad(cost,
        parameters).

    :param stop: a shared variable (scalar integer) that (if provided) will be
        updated to say when the iterative minimization algorithm has finished
        (1) or requires more iterations (0).

    :param updates: a dictionary to update with the (var, new_value) items
        associated with the iterative algorithm.  The default is a new empty
        dictionary.  A KeyError is raised in case of key collisions.

    :param kwargs: algorithm-dependent arguments

    :returns: a dictionary mapping each parameter to the expression for the
       value it should take next in order to carry out the optimization
       procedure.

       If all the parameters are shared variables, then this dictionary may be
       passed as the ``updates`` argument to theano.function.

       There may be more (key, value) pairs in the dictionary, corresponding to
       internal variables that are part of the optimization algorithm.

    """
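
The contract described in the docstring can be illustrated without Theano.
The sketch below is a toy analogue, not a proposed implementation: `Shared`
is a hypothetical stand-in for a Theano shared variable, zero-argument
callables stand in for Theano expressions, and `apply_updates` mimics the
simultaneous update that theano.function performs with its ``updates``
argument. The `step_size` and `tolerance` keyword arguments and the
gradient-norm stopping rule are assumptions.

```python
class Shared:
    """Hypothetical stand-in for a Theano shared variable: a mutable box."""

    def __init__(self, value):
        self.value = value


def sgd(parameters, cost=None, grads=None, stop=None, updates=None,
        step_size=0.1, tolerance=1e-8, **kwargs):
    """Toy analogue of the iterative_optimizer contract (plain gradient descent).

    `grads` holds one zero-argument callable per parameter; each returns the
    current gradient and plays the role of a Theano expression.
    """
    if grads is None:
        # Theano would default to tensor.grad(cost, parameters); this sketch cannot.
        raise ValueError("this sketch cannot differentiate `cost`; pass `grads`")
    if updates is None:
        updates = {}

    def add(var, expr):
        # A key collision means two algorithms want to update the same variable.
        if var in updates:
            raise KeyError(var)
        updates[var] = expr

    for param, grad in zip(parameters, grads):
        # Default arguments bind the current loop values to each lambda.
        add(param, lambda p=param, g=grad: p.value - step_size * g())
    if stop is not None:
        # Assumed stopping rule: 1 once every gradient is close to zero, else 0.
        add(stop, lambda: int(all(abs(g()) < tolerance for g in grads)))
    return updates


def apply_updates(updates):
    """Evaluate every new value from the old state first, then assign them all."""
    new_values = {var: expr() for var, expr in updates.items()}
    for var, value in new_values.items():
        var.value = value


# Minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x = Shared(0.0)
stop = Shared(0)
updates = sgd([x], grads=[lambda: 2.0 * (x.value - 3.0)], stop=stop)
for _ in range(1000):
    apply_updates(updates)
    if stop.value:
        break
```

Note that the returned dictionary contains an entry for `stop` as well as for
`x`, which matches the remark that extra (key, value) pairs may appear for
variables internal to the algorithm.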


One-shot Interface
------------------

def minimize(x0, f, df, opt_algo, **kwargs):
    """
    Return a point x_new that minimizes function `f` with derivative `df`.

    This is intended to provide an interface similar to SciPy's or MATLAB's
    minimization routines.

    :type x0: numpy ndarray
    :param x0: starting point for minimization

    :type f: python callable mapping something like x0 to a scalar
    :param f: function to minimize

    :type df: python callable mapping something like x0 to the derivative of f at that point
    :param df: derivative of `f`

    :param opt_algo: one of the functions that implements the
        `iterative_optimizer` interface.

    :param kwargs: passed through to `opt_algo`

    """
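
One way `minimize` could drive an iterative optimizer is sketched below, again
without Theano so that it runs on its own. Everything here is an assumption
made for illustration: `Box` is a hypothetical stand-in for a shared variable,
zero-argument callables stand in for Theano expressions, `x0` is a plain list
of floats rather than an ndarray, and `max_iter`, `step_size` and `tolerance`
are invented keyword arguments.

```python
class Box:
    """Hypothetical mutable container standing in for a Theano shared variable."""

    def __init__(self, value):
        self.value = value


def gradient_descent(parameters, cost=None, grads=None, stop=None, updates=None,
                     step_size=0.1, tolerance=1e-8, **kwargs):
    """Minimal optimizer with the same argument layout as iterative_optimizer."""
    updates = {} if updates is None else updates
    for p, g in zip(parameters, grads):
        # New point: move each coordinate against its gradient component.
        updates[p] = lambda p=p, g=g: [
            xi - step_size * gi for xi, gi in zip(p.value, g())]
    if stop is not None:
        # Assumed stopping rule: every gradient component is close to zero.
        updates[stop] = lambda: int(
            all(abs(gi) < tolerance for g in grads for gi in g()))
    return updates


def minimize(x0, f, df, opt_algo, max_iter=10000, **kwargs):
    """One-shot wrapper: build the update rules once, then iterate until done."""
    x = Box(list(x0))
    stop = Box(0)
    updates = opt_algo([x],
                       cost=lambda: f(x.value),
                       grads=[lambda: df(x.value)],
                       stop=stop,
                       **kwargs)
    for _ in range(max_iter):
        # Compute all new values from the old state, then assign them together.
        new_values = {var: expr() for var, expr in updates.items()}
        for var, value in new_values.items():
            var.value = value
        if stop.value:
            break
    return x.value


def f(x):
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


def df(x):
    return [2.0 * (x[0] - 1.0), 2.0 * (x[1] + 2.0)]


x_new = minimize([0.0, 0.0], f, df, gradient_descent, step_size=0.25)
```

In this sketch the caller sees only plain Python values and callables, while
the loop and the `stop` flag stay hidden inside `minimize`.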

OD: Could it be more convenient for x0 to be a list?

OD: Why distinguish between iterative and one-shot versions? A one-shot
    algorithm can be seen as an iterative one that stops after its first
    iteration. The difference I see between the two interfaces proposed here
    is mostly that one relies on Theano while the other one does not, but
    hopefully a non-Theano one can be created by simply wrapping around the
    Theano one.