changeset 1100:153cf820a975

v2planning - updates to api_optimization
author James Bergstra <bergstrj@iro.umontreal.ca>
date Mon, 13 Sep 2010 13:41:53 -0400
parents 0b666177f725
children b422cbaddc52
files doc/v2_planning/api_optimization.txt
diffstat 1 files changed, 45 insertions(+), 12 deletions(-)
--- a/doc/v2_planning/api_optimization.txt	Mon Sep 13 09:47:30 2010 -0400
+++ b/doc/v2_planning/api_optimization.txt	Mon Sep 13 13:41:53 2010 -0400
@@ -23,26 +23,33 @@
 only uses Theano for the implementation.
 
 
-Iterative Interface
--------------------
+Theano Interface
+-----------------
+
+The Theano interface to an optimization algorithm is a function that returns a
+dictionary of updates suitable for passing to theano.function.  Implementations
+of iterative optimization algorithms should be module-level functions with a
+signature like that of 'iterative_optimizer' below.
 
 def iterative_optimizer(parameters, 
         cost=None,
-        grads=None,
+        gradients=None,
         stop=None, 
         updates=None,
         **kwargs):
     """
-    :param parameters: list or tuple of Theano variables (typically shared vars)
+    :param parameters: list or tuple of Theano variables 
         that we want to optimize iteratively.  If we're minimizing f(x), then
-        together, these variables represent 'x'.
+        together, these variables represent 'x'.  Typically these are shared
+        variables and their values are the initial values for the minimization
+        algorithm.
 
     :param cost: scalar-valued Theano variable that computes an exact or noisy estimate of
         cost  (what are the conditions on the noise?).  Some algorithms might
-        need an exact cost, some algorithms might ignore the cost if the grads
-        are given.
+        need an exact cost, some algorithms might ignore the cost if the
+        gradients are given.
 
-    :param grads: list or tuple of Theano variables representing the gradients on
+    :param gradients: list or tuple of Theano variables representing the gradients on
         the corresponding parameters.  These default to tensor.grad(cost,
         parameters).
 
@@ -68,8 +75,16 @@
     """
 
 
-One-shot Interface
-------------------
+Numpy Interface
+---------------
+
+The numpy interface to optimization algorithms is supposed to mimic scipy's:
+its arguments are numpy arrays and plain Python functions that manipulate
+numpy arrays.
+
+TODO: There is also room for an iterative object (one that does not hog
+program control) that nonetheless works on numpy objects.  Actually,
+minimize() should use this iterative interface under the hood.
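+
+One possible shape for such an object (purely illustrative, not a decided
+design):
+
+import numpy
+
+class IterativeMinimizer(object):
+    # illustrative sketch: the caller drives the loop, so the optimizer
+    # never hogs program control
+    def __init__(self, x0, f, df, step_size=0.1):
+        self.x = numpy.array(x0, dtype=float)
+        self.f, self.df, self.step_size = f, df, step_size
+
+    def step(self):
+        # one gradient step on a numpy array; returns the new cost value
+        self.x = self.x - self.step_size * self.df(self.x)
+        return self.f(self.x)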
 
 def minimize(x0, f, df, opt_algo, **kwargs):
     """
@@ -94,12 +109,30 @@
 
     """
 
-OD: Could it be more convenient for x0 to be a list?
+OD asks: Could it be more convenient for x0 to be a list?
+ 
+JB replies: Yes, but that's not the interface used by other minimize()
+routines (e.g. in scipy).  Maybe another list-based interface is required?
 
-OD: Why make a difference between iterative and one-shot versions? A one-shot
+
+OD asks: Why make a difference between iterative and one-shot versions? A one-shot
     algorithm can be seen as an iterative one that stops after its first
     iteration. The difference I see between the two interfaces proposed here
     is mostly that one relies on Theano while the other one does not, but
     hopefully a non-Theano one can be created by simply wrapping around the
     Theano one.
 
+JB replies: Right, it would make more sense to distinguish them by the fact
+that one works on Theano objects and the other on general Python callables.
+There is room for an iterative numpy interface, but I haven't made it yet.
+Would that answer your question?
+
+
+
+Examples
+--------
+
+
+Simple stochastic gradient descent with extra updates:
+
+sgd([p], gradients=[g], updates={a:b}, step_size=.1) will return {a:b, p:p-.1*g}
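+
+A minimal sketch of such an sgd implementation (assuming only the
+'iterative_optimizer' signature above; illustrative, not a committed
+implementation):
+
+import theano.tensor as tensor
+
+def sgd(parameters, cost=None, gradients=None, stop=None, updates=None,
+        step_size=0.1, **kwargs):
+    # default the gradients to tensor.grad(cost, parameters), as in the spec
+    if gradients is None:
+        gradients = tensor.grad(cost, parameters)
+    # start from any caller-supplied extra updates (the {a: b} above)
+    rval = dict(updates) if updates else {}
+    # one update rule per parameter: p <- p - step_size * gradient
+    for p, g in zip(parameters, gradients):
+        rval[p] = p - step_size * g
+    return rval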