doc/v2_planning/api_optimization.txt @ 1100:153cf820a975

v2planning - updates to api_optimization
author James Bergstra <bergstrj@iro.umontreal.ca>
date Mon, 13 Sep 2010 13:41:53 -0400
parents 16ea3e5c5a7a
children e7c52923f122
The API includes an iterative interface based on Theano, and a one-shot
interface similar to SciPy and MATLAB that is based on Python and Numpy, and
that only uses Theano for the implementation.


Theano Interface
----------------

The Theano interface to optimization algorithms is to ask the optimizer for a
dictionary of updates that can be used in theano.function. Implementations of
iterative optimization algorithms should be global functions with a signature
like 'iterative_optimizer'.

def iterative_optimizer(parameters,
        cost=None,
        gradients=None,
        stop=None,
        updates=None,
        **kwargs):
    """
    :param parameters: list or tuple of Theano variables
        that we want to optimize iteratively.  If we're minimizing f(x), then
        together, these variables represent 'x'.  Typically these are shared
        variables and their values are the initial values for the minimization
        algorithm.

    :param cost: scalar-valued Theano variable that computes an exact or noisy
        estimate of cost (what are the conditions on the noise?).  Some
        algorithms might need an exact cost, some algorithms might ignore the
        cost if the gradients are given.

    :param gradients: list or tuple of Theano variables representing the
        gradients on the corresponding parameters.  These default to
        tensor.grad(cost, parameters).

    :param stop: a shared variable (scalar integer) that (if provided) will be
        updated to say when the iterative minimization algorithm has finished

    ...

        internal variables that are part of the optimization algorithm.

    """


Numpy Interface
---------------

The numpy interface to optimization algorithms is supposed to mimic scipy's.
Its arguments are numpy arrays, and functions that manipulate numpy arrays.

TODO: There is also room for an iterative object (that doesn't hog program
control) but which nonetheless works on numpy objects.  Actually, minimize()
should use this iterative interface under the hood.

def minimize(x0, f, df, opt_algo, **kwargs):
    """
    Return a point x_new that minimizes function `f` with derivative `df`.

    ...

    :param kwargs: passed through to `opt_algo`

    """

OD asks: Could it be more convenient for x0 to be a list?

JB replies: Yes, but that's not the interface used by other minimize()
routines (e.g. in scipy).  Maybe another list-based interface is required?


OD asks: Why make a difference between iterative and one-shot versions?  A
one-shot algorithm can be seen as an iterative one that stops after its first
iteration.  The difference I see between the two interfaces proposed here is
mostly that one relies on Theano while the other one does not, but hopefully a
non-Theano one can be created by simply wrapping around the Theano one.

JB replies: Right, it would make more sense to distinguish them by the fact
that one works on Theano objects, and the other on general Python callable
functions.  There is room for an iterative numpy interface, but I didn't make
it yet.  Does that answer your question?
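
Purely as a strawman, such an iterative numpy object might look like the
following (it does not exist yet, and every name here is an invention):

import numpy

class IterativeMinimizer(object):
    def __init__(self, x0, f, df, step_size=.1):
        self.x = numpy.array(x0, dtype=float)
        self.f, self.df, self.step_size = f, df, step_size

    def step(self):
        # the caller keeps program control: one update per call
        self.x -= self.step_size * self.df(self.x)
        return self.f(self.x)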


Examples
--------


Simple stochastic gradient descent with extra updates:

sgd([p], gradients=[g], updates={a: b}, step_size=.1) will return {a: b, p: p - .1*g}
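
A runnable rendering of that example, assuming the sgd sketched under "Theano
Interface" (the cost and shapes here are arbitrary):

import numpy
import theano
import theano.tensor as tensor

p = theano.shared(numpy.zeros(3))
a = theano.shared(0.0)
x = tensor.dvector()
g = tensor.grad(((p - x) ** 2).sum(), p)
# the extra update {a: a + 1} passes through untouched
updates = sgd([p], gradients=[g], updates={a: a + 1}, step_size=.1)
step = theano.function([x], [], updates=updates)
step(numpy.ones(3))  # p moves toward x; a is incremented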