annotate doc/v2_planning/optimization.txt @ 1064:a41cc29cee26
v2planning optimization - API draft
author | James Bergstra <bergstrj@iro.umontreal.ca> |
---|---|
date | Thu, 09 Sep 2010 17:44:43 -0400 |
parents | baf1988db557 |
children | 9fe0f0755b03 |

=========================
Optimization for Learning
=========================

Members: Bergstra, Lamblin, Delalleau, Glorot, Breuleux, Bordes
Leader: Bergstra



Initial Writeup by James
=========================================



Previous work - scikits, openopt, scipy provide function optimization
algorithms.  These are not currently GPU-enabled but may be in the future.


IS PREVIOUS WORK SUFFICIENT?
--------------------------------

In many cases it is (I used it for sparse coding, and it was ok).

These packages provide batch optimization, whereas we typically need online
optimization.

It can be faster (to run) and more convenient (to implement) to have
optimization algorithms as Theano update expressions.

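To make the batch/online distinction concrete, here is a minimal sketch in
plain Python (no Theano, scipy or GPU involved).  The toy least-squares
problem, the names `batch_gd` and `online_sgd`, and the step sizes are all
illustrative assumptions, not part of any of the packages mentioned above.

```python
import random

# Toy 1-D least-squares problem: fit w so that w * x ~= y, with y = 3 * x.
DATA = [(x, 3.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]


def grad_example(w, x, y):
    """Gradient of 0.5 * (w * x - y) ** 2 with respect to w."""
    return (w * x - y) * x


def batch_gd(w, data, lr=0.1, n_steps=200):
    """Batch optimization: every step uses the gradient over the full dataset."""
    for _ in range(n_steps):
        g = sum(grad_example(w, x, y) for x, y in data) / len(data)
        w -= lr * g
    return w


def online_sgd(w, data, lr=0.1, n_epochs=200, seed=0):
    """Online optimization: the parameter is updated after every single example."""
    rng = random.Random(seed)
    for _ in range(n_epochs):
        for x, y in rng.sample(data, len(data)):  # visit examples in random order
            w -= lr * grad_example(w, x, y)
    return w


w_batch = batch_gd(0.0, DATA)
w_online = online_sgd(0.0, DATA)
```

Both loops reach the same solution on this noise-free toy problem; the
difference is only in how much data each parameter update looks at.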

What optimization algorithms do we want/need?
---------------------------------------------

- sgd
- sgd + momentum
- sgd with annealing schedule
- TONGA
- James Martens' Hessian-free
- Conjugate gradients, batch and (large) mini-batch [that is also what Martens' method does]

Do we need anything to make batch algos work better with Pylearn things?
- conjugate methods? yes
- L-BFGS? maybe, when needed

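For reference, the first three items in the list above can be sketched as
scalar update rules in plain Python.  The velocity form of momentum and the
"constant, then 1/t" schedule are common conventions assumed here, and the
names `sgd_step`, `momentum_step` and `annealed_lr` are hypothetical; this
is not a statement of what Pylearn should settle on.

```python
def sgd_step(w, g, lr):
    """Plain SGD: move against the gradient."""
    return w - lr * g


def momentum_step(w, v, g, lr, mu=0.9):
    """SGD + momentum (classical velocity form, one of several conventions)."""
    v_new = mu * v - lr * g
    return w + v_new, v_new


def annealed_lr(lr0, t, tau=100.0):
    """Annealing schedule: lr0 up to step tau, then decaying like lr0 * tau / t."""
    return lr0 * tau / max(tau, float(t))


def grad(w):
    """Gradient of the toy cost 0.5 * (w - 2) ** 2."""
    return w - 2.0


w_sgd = w_mom = 0.0
v = 0.0
for t in range(500):
    lr = annealed_lr(0.1, t)
    w_sgd = sgd_step(w_sgd, grad(w_sgd), lr)
    w_mom, v = momentum_step(w_mom, v, grad(w_mom), lr)
```

Note that momentum needs an extra piece of state (`v`) besides the
parameter itself, which is the kind of auxiliary variable an update-based
API has to account for.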


Proposal for API
================

Stick to the same style of API that we've used for SGD so far.  I think it has
worked well.  It takes Theano expressions as inputs and returns Theano
expressions as results.  The caller is responsible for building those
expressions into a callable function that does the minimization (and possibly
other things too).


def stochastic_gradientbased_optimization_updates(parameters, cost=None, grads=None, **kwargs):
    """
    :param parameters: list or tuple of Theano variables (typically shared vars)
        that we want to optimize iteratively.

    :param cost: scalar-valued Theano variable that computes a noisy estimate of
        the cost (what are the conditions on the noise?).  The cost is ignored if
        grads are given.

    :param grads: list or tuple of Theano variables representing the gradients on
        the corresponding parameters.  These default to tensor.grad(cost,
        parameters).

    :param kwargs: algorithm-dependent arguments

    :returns: a list of pairs (v, new_v) that indicate the value (new_v) each
        variable (v) should take in order to carry out the optimization procedure.

        The first section of the return value list corresponds to the terms in
        `parameters`, and the optimization algorithm can return additional update
        expressions afterward.  This list of pairs can be passed directly to the
        dict() constructor to create a dictionary such that dct[v] == new_v.
    """

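A rough sketch of this contract, using plain-Python stand-ins so that it runs
without Theano.  Everything named here is an assumption made for
illustration: `Var` replaces a shared variable, zero-argument functions
replace symbolic expressions, `apply_updates` plays the role of the compiled
function the caller would build, and the `lr` / `momentum` keywords are
hypothetical algorithm-dependent arguments.  The `cost` argument is left out
because there is no symbolic differentiation to fall back on.

```python
class Var(object):
    """Hypothetical stand-in for a Theano shared variable: a named mutable value."""

    def __init__(self, name, value):
        self.name = name
        self.value = value


def _momentum_pair(p, g, lr, momentum):
    """Build the update pairs for one parameter and its auxiliary velocity."""
    vel = Var(p.name + "_velocity", 0.0)  # auxiliary state owned by the optimizer

    def new_vel():
        return momentum * vel.value - lr * g()

    def new_p():
        return p.value + new_vel()

    return (p, new_p), (vel, new_vel)


def sgd_momentum_updates(parameters, grads, lr=0.1, momentum=0.9):
    """Return (v, new_v) pairs: parameter updates first, auxiliary ones afterward."""
    pairs = [_momentum_pair(p, g, lr, momentum) for p, g in zip(parameters, grads)]
    return [pp for pp, _ in pairs] + [aux for _, aux in pairs]


def apply_updates(updates):
    """Caller's job: evaluate every new value first, then assign them all at once."""
    dct = dict(updates)  # works because the result is a list of pairs
    new_values = dict((v, expr()) for v, expr in dct.items())
    for v, value in new_values.items():
        v.value = value


w = Var("w", 0.0)


def grad_w():
    """Gradient of the toy cost 0.5 * (w - 2) ** 2."""
    return w.value - 2.0


updates = sgd_momentum_updates([w], [grad_w])
for _ in range(300):
    apply_updates(updates)
```

The update for `w` comes first in the returned list and the velocity update
follows it, matching the ordering described in the docstring.
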

Why not a class interface with an __init__ that takes the kwargs, and an
updates() that returns the updates?  It would be wrong for auxiliary shared
variables to be involved in two updates, so the interface should not encourage
separate methods for those two steps.

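One possible reading of that concern, sketched with plain-Python stand-ins.
The names `Var`, `AnnealedSGD` and `merge_updates` are hypothetical, and
strings stand in for symbolic expressions.  The optimizer creates an
auxiliary step counter in __init__, so calling updates() twice produces two
update lists that both assign to the same variable.

```python
class Var(object):
    """Hypothetical stand-in for a Theano shared variable."""

    def __init__(self, name):
        self.name = name


class AnnealedSGD(object):
    """Class-style interface of the kind questioned above (illustrative only)."""

    def __init__(self, lr0=0.1, tau=100.0):
        self.lr0, self.tau = lr0, tau
        self.step = Var("step")  # auxiliary variable created once, in __init__

    def updates(self, parameters, grads):
        # Strings stand in for symbolic expressions.
        lr = "%s * %s / max(%s, step)" % (self.lr0, self.tau, self.tau)
        pairs = [(p, "%s - (%s) * %s" % (p.name, lr, g))
                 for p, g in zip(parameters, grads)]
        return pairs + [(self.step, "step + 1")]  # auxiliary update goes last


def merge_updates(*update_lists):
    """Combine update lists, refusing to assign one variable twice."""
    merged = {}
    for pairs in update_lists:
        for var, new_value in pairs:
            if var in merged:
                raise ValueError("two updates for %s" % var.name)
            merged[var] = new_value
    return merged


w1, w2 = Var("w1"), Var("w2")
opt = AnnealedSGD()
first = opt.updates([w1], ["grad_w1"])
second = opt.updates([w2], ["grad_w2"])  # the same `step` variable appears again

try:
    merge_updates(first, second)
    conflict = None
except ValueError as err:
    conflict = str(err)

# Plain dict() would not complain: it silently keeps the last pair for `step`.
n_entries = len(dict(first + second))
```

Whether the counter was meant to advance once or twice per call is exactly
the ambiguity; a single function that creates its auxiliary variables and
returns their updates in one step does not leave room for it.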