pylearn: comparison of doc/v2_planning/api_optimization.txt @ 1100:153cf820a975

v2planning - updates to api_optimization

author:   James Bergstra <bergstrj@iro.umontreal.ca>
date:     Mon, 13 Sep 2010 13:41:53 -0400
parents:  16ea3e5c5a7a
children: e7c52923f122
comparison of 1099:0b666177f725 and 1100:153cf820a975; the text below is the
updated revision, with unchanged context elided ([...]).

[...]

The API includes an iterative interface based on Theano, and a one-shot
interface, similar to SciPy and MATLAB, that is based on Python and Numpy and
only uses Theano for the implementation.


Theano Interface
----------------

The Theano interface to optimization algorithms is to ask the optimizer for a
dictionary of updates that can be used in theano.function.  Implementations of
iterative optimization algorithms should be global functions with a signature
like 'iterative_optimizer'.

def iterative_optimizer(parameters,
                        cost=None,
                        gradients=None,
                        stop=None,
                        updates=None,
                        **kwargs):
    """
    :param parameters: list or tuple of Theano variables
        that we want to optimize iteratively. If we're minimizing f(x), then
        together, these variables represent 'x'. Typically these are shared
        variables and their values are the initial values for the minimization
        algorithm.

    :param cost: scalar-valued Theano variable that computes an exact or noisy
        estimate of cost (what are the conditions on the noise?). Some
        algorithms might need an exact cost, some algorithms might ignore the
        cost if the gradients are given.

    :param gradients: list or tuple of Theano variables representing the
        gradients on the corresponding parameters. These default to
        tensor.grad(cost, parameters).

    :param stop: a shared variable (scalar integer) that (if provided) will be
        updated to say when the iterative minimization algorithm has finished
        [...]
        internal variables that are part of the optimization algorithm.

    """

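For concreteness, here is a minimal sketch of one implementation of this
signature and of how its update dictionary plugs into theano.function. It
assumes plain stochastic gradient descent with a step_size keyword (as in the
sgd example in the Examples section); it is an illustration, not the committed
implementation.

import numpy
import theano
import theano.tensor as TT

def sgd(parameters, cost=None, gradients=None, stop=None, updates=None,
        step_size=0.1):
    """Sketch: an SGD instance of the 'iterative_optimizer' signature.
    Returns a dictionary of updates suitable for theano.function.
    The `stop` argument is unused here (plain SGD never signals completion)."""
    if gradients is None:
        gradients = TT.grad(cost, parameters)
    rval = dict(updates) if updates else {}
    for p, g in zip(parameters, gradients):
        rval[p] = p - step_size * g   # one gradient step per function call
    return rval

# Usage: each call to the compiled function performs one minimization step.
x = theano.shared(numpy.zeros(5), name='x')
cost = ((x - 1) ** 2).sum()
step = theano.function([], cost, updates=sgd([x], cost=cost, step_size=0.1))
for i in range(100):
    step()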


Numpy Interface
---------------

The numpy interface to optimization algorithms is supposed to mimic scipy's.
Its arguments are numpy arrays, and functions that manipulate numpy arrays.

TODO: There is also room for an iterative object (one that doesn't hog program
control) but which nonetheless works on numpy objects. Actually, minimize()
should use this iterative interface under the hood.

def minimize(x0, f, df, opt_algo, **kwargs):
    """
    Return a point x_new that minimizes function `f` with derivative `df`.

    [...]

    :param kwargs: passed through to `opt_algo`

    """

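A rough sketch of minimize(), under the assumption that opt_algo follows
scipy.optimize's (f, x0, fprime=...) calling convention; the choice of fmin_cg
in the usage example is illustrative and not part of the spec, and the
parameter documentation elided above is not reproduced here.

import numpy
import scipy.optimize

def minimize(x0, f, df, opt_algo, **kwargs):
    # Assumption: opt_algo takes (f, x0, fprime=...) like scipy.optimize.fmin_*.
    x0 = numpy.asarray(x0)
    return opt_algo(f, x0, fprime=df, **kwargs)

# Example usage with scipy's conjugate-gradient routine as opt_algo.
f = lambda x: ((x - 1) ** 2).sum()
df = lambda x: 2 * (x - 1)
x_new = minimize(numpy.zeros(5), f, df, scipy.optimize.fmin_cg, disp=0)
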

OD asks: Could it be more convenient for x0 to be a list?

JB replies: Yes, but that's not the interface used by other minimize()
routines (e.g. in scipy). Maybe another list-based interface is required?


OD asks: Why make a difference between iterative and one-shot versions? A
one-shot algorithm can be seen as an iterative one that stops after its first
iteration. The difference I see between the two interfaces proposed here is
mostly that one relies on Theano while the other one does not, but hopefully a
non-Theano one can be created by simply wrapping around the Theano one.

JB replies: Right, it would make more sense to distinguish them by the fact that
one works on Theano objects, and the other on general Python callable functions.
There is room for an iterative numpy interface, but I didn't make it yet. Would
that answer your question?
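
To make the discussion concrete, here is a hedged sketch of the kind of
wrapping OD describes, reusing the imports and the sgd sketch from the Theano
Interface section above. The function name, fixed iteration count, and
step_size are illustrative assumptions, and it only works when f (and df, if
given) can also be applied to Theano variables, which is exactly JB's point
about the difference between Theano objects and general Python callables.

def minimize_via_theano(x0, f, df=None, n_steps=1000, step_size=0.01):
    # Wrap the Theano-level iterative interface behind a numpy-style call.
    # Assumes f (and df, if given) also work when applied to Theano variables.
    x = theano.shared(numpy.asarray(x0))
    cost = f(x)
    gradients = [df(x)] if df is not None else None
    step = theano.function([], cost,
                           updates=sgd([x], cost=cost, gradients=gradients,
                                       step_size=step_size))
    for i in range(n_steps):
        step()
    return x.get_value()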


Examples
--------

Simple stochastic gradient descent with extra updates:

sgd([p], gradients=[g], updates={a:b}, step_size=.1) will return {a:b, p:p-.1*g}
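
Spelled out with the sgd sketch from the Theano Interface section above (an
assumption about the implementation, not committed code), the example reads:

p = theano.shared(numpy.zeros(3), name='p')
g = TT.vector('g')               # gradient expression for p
a = theano.shared(0.0, name='a')
b = a + 1                        # an arbitrary extra update to carry along

updates = sgd([p], gradients=[g], updates={a: b}, step_size=.1)
# updates == {a: b, p: p - .1 * g}, ready for theano.function(..., updates=updates)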