# HG changeset patch
# User James Bergstra
# Date 1284399713 14400
# Node ID 153cf820a97503bfc3d539fe9b9f738edf1e4370
# Parent  0b666177f7253ade53f52c42d0307b1344149976
v2planning - updates to api_optimization

diff -r 0b666177f725 -r 153cf820a975 doc/v2_planning/api_optimization.txt
--- a/doc/v2_planning/api_optimization.txt	Mon Sep 13 09:47:30 2010 -0400
+++ b/doc/v2_planning/api_optimization.txt	Mon Sep 13 13:41:53 2010 -0400
@@ -23,26 +23,33 @@
 only uses Theano for the implementation.
 
 
-Iterative Interface
--------------------
+Theano Interface
+----------------
+
+The theano interface to optimization algorithms is to ask for a dictionary of
+updates that can be used in theano.function. Implementations of iterative
+optimization algorithms should be global functions with a signature like
+'iterative_optimizer'.
 
 def iterative_optimizer(parameters,
         cost=None,
-        grads=None,
+        gradients=None,
         stop=None,
         updates=None,
         **kwargs):
     """
-    :param parameters: list or tuple of Theano variables (typically shared vars)
+    :param parameters: list or tuple of Theano variables
        that we want to optimize iteratively. If we're minimizing f(x), then
-        together, these variables represent 'x'.
+        together, these variables represent 'x'. Typically these are shared
+        variables and their values are the initial values for the minimization
+        algorithm.
 
     :param cost: scalar-valued Theano variable that computes an exact or noisy estimate of
         cost (what are the conditions on the noise?). Some algorithms might
-        need an exact cost, some algorithms might ignore the cost if the grads
-        are given.
+        need an exact cost, some algorithms might ignore the cost if the
+        gradients are given.
 
-    :param grads: list or tuple of Theano variables representing the gradients on
+    :param gradients: list or tuple of Theano variables representing the gradients on
         the corresponding parameters. These default to tensor.grad(cost,
         parameters).
 
@@ -68,8 +75,16 @@
     """
 
 
-One-shot Interface
-------------------
+Numpy Interface
+---------------
+
+The numpy interface to optimization algorithms is supposed to mimick
+scipy's. Its arguments are numpy arrays, and functions that manipulate numpy
+arrays.
+
+TODO: There is also room for an iterative object (that doesn't hog program
+control) but which nonetheless works on numpy objects. Actually minimize() should
+use this iterative interface under the hood.
 
 def minimize(x0, f, df, opt_algo, **kwargs):
     """
@@ -94,12 +109,30 @@
     """
 
 
-OD: Could it be more convenient for x0 to be a list?
+OD asks: Could it be more convenient for x0 to be a list?
+
+JB replies: Yes, but that's not the interface used by other minimize()
+routines (e.g. in scipy). Maybe another list-based interface is required?
 
-OD: Why make a difference between iterative and one-shot versions? A one-shot
+
+OD asks: Why make a difference between iterative and one-shot versions? A one-shot
 algorithm can be seen as an iterative one that stops after its first
 iteration. The difference I see between the two interfaces proposed here is
 mostly that one relies on Theano while the other one does not, but hopefully
 a non-Theano one can be created by simply wrapping around the Theano one.
 
 
+JB replies: Right, it would make more sense to distinguish them by the fact that
+one works on Theano objects, and the other on general Python callable functions.
+There is room for an iterative numpy interface, but I didn't make it yet. Would
+that answer your question?
+
+
+
+Examples
+--------
+
+
+Simple stochastic gradient descent with extra updates:
+
+sgd([p], gradients=[g], updates={a:b}, step_size=.1) will return {a:b, p:p-.1*g}
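
The sgd() call in the patch's new Examples section is described only by its
return value. The following is a minimal sketch (not part of the changeset) of
how such a function could follow the 'iterative_optimizer' signature proposed
above and produce that return value; the function body, the step_size default,
and the small quadratic test problem are illustrative assumptions.

import numpy
import theano
import theano.tensor as TT

def sgd(parameters, cost=None, gradients=None, stop=None, updates=None,
        step_size=0.1):
    """Return an updates dictionary for theano.function that performs one
    step of stochastic gradient descent on `parameters`.

    `stop` and the other criteria of the proposed interface are ignored in
    this sketch.
    """
    if gradients is None:
        # default the gradients to tensor.grad(cost, parameters)
        gradients = TT.grad(cost, parameters)
    # start from the caller's extra updates (e.g. {a: b}) and add one
    # update rule p <- p - step_size * g for each parameter
    rval = dict(updates) if updates is not None else {}
    for p, g in zip(parameters, gradients):
        rval[p] = p - step_size * g
    return rval

# usage sketch: minimize a quadratic in a shared variable p
p = theano.shared(numpy.asarray([3.0, -2.0]))
cost = ((p - 1.0) ** 2).sum()
step = theano.function([], cost, updates=sgd([p], cost=cost, step_size=0.1))
for i in range(100):
    step()

With gradients=[g] and updates={a: b}, the sketch returns {a: b, p: p - .1*g},
matching the behaviour stated in the Examples section.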