annotate doc/v2_planning/optimization.txt @ 1064:a41cc29cee26

v2planning optimization - API draft
author James Bergstra <bergstrj@iro.umontreal.ca>
date Thu, 09 Sep 2010 17:44:43 -0400
parents baf1988db557
children 9fe0f0755b03
rev   line source
1064
a41cc29cee26 v2planning optimization - API draft
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1057
diff changeset
1 =========================
a41cc29cee26 v2planning optimization - API draft
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1057
diff changeset
2 Optimization for Learning
a41cc29cee26 v2planning optimization - API draft
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1057
diff changeset
3 =========================
a41cc29cee26 v2planning optimization - API draft
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1057
diff changeset
4
a41cc29cee26 v2planning optimization - API draft
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1057
diff changeset
5 Members: Bergstra, Lamblin, Dellaleau, Glorot, Breuleux, Bordes
a41cc29cee26 v2planning optimization - API draft
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1057
diff changeset
6 Leader: Bergstra
a41cc29cee26 v2planning optimization - API draft
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1057
diff changeset
7
a41cc29cee26 v2planning optimization - API draft
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1057
diff changeset
8
a41cc29cee26 v2planning optimization - API draft
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1057
diff changeset
9
a41cc29cee26 v2planning optimization - API draft
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1057
diff changeset
10 Initial Writeup by James
1009
dc5185cca21e Added files for Coding Style and Optimization committees
Olivier Delalleau <delallea@iro>
parents:
diff changeset
11 =========================================
dc5185cca21e Added files for Coding Style and Optimization committees
Olivier Delalleau <delallea@iro>
parents:
diff changeset
12
1013
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
13
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
14
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
15 Previous work - scikits, openopt, scipy provide function optimization
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
16 algorithms. These are not currently GPU-enabled but may be in the future.
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
17
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
18
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
19 IS PREVIOUS WORK SUFFICIENT?
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
20 --------------------------------
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
21
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
22 In many cases it is (I used it for sparse coding, and it was ok).
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
23
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
24 These packages provide batch optimization, whereas we typically need online
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
25 optimization.
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
26
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
27 It can be faster (to run) and more convenient (to implement) to have
1016
618b9fdbfda5 optimization: Minor typo fixes
Olivier Delalleau <delallea@iro>
parents: 1013
diff changeset
28 optimization algorithms as Theano update expressions.
1013
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
29
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
30
1016
618b9fdbfda5 optimization: Minor typo fixes
Olivier Delalleau <delallea@iro>
parents: 1013
diff changeset
31 What optimization algorithms do we want/need?
1013
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
32 ---------------------------------------------
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
33
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
34 - sgd
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
35 - sgd + momentum
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
36 - sgd with annealing schedule
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
37 - TONGA
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
38 - James Marten's Hessian-free
1027
a1b6ccd5b6dc few comments added
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents: 1016
diff changeset
39 - Conjugate gradients, batch and (large) mini-batch [that is also what Marten's thing does]
1013
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
40
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
41 Do we need anything to make batch algos work better with Pylearn things?
1027
a1b6ccd5b6dc few comments added
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents: 1016
diff changeset
42 - conjugate methods? yes
a1b6ccd5b6dc few comments added
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents: 1016
diff changeset
43 - L-BFGS? maybe, when needed
1013
5e9a3d9bc0b4 optimization - added some text
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1009
diff changeset
44
1027
a1b6ccd5b6dc few comments added
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents: 1016
diff changeset
45
a1b6ccd5b6dc few comments added
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents: 1016
diff changeset
46
a1b6ccd5b6dc few comments added
Yoshua Bengio <bengioy@iro.umontreal.ca>
parents: 1016
diff changeset
47
1057
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
48
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
49 Proposal for API
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
50 ================
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
51
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
52 Stick to the same style of API that we've used for SGD so far. I think it has
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
53 worked well. It takes theano expressions as inputs and returns theano
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
54 expressions as results. The caller is responsible for building those
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
55 expressions into a callable function that does the minimization (and other
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
56 things too maybe).
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
57
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
58
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
59 def stochastic_gradientbased_optimization_updates(parameters, cost=None, grads=None, **kwargs):
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
60 """
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
61 :param parameters: list or tuple of Theano variables (typically shared vars)
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
62 that we want to optimize iteratively algorithm.
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
63
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
64 :param cost: scalar-valued Theano variable that computes noisy estimate of
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
65 cost (what are the conditions on the noise?). The cost is ignored if
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
66 grads are given.
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
67
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
68 :param grads: list or tuple of Theano variables representing the gradients on
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
69 the corresponding parameters. These default to tensor.grad(cost,
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
70 parameters).
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
71
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
72 :param kwargs: algorithm-dependent arguments
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
73
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
74 :returns: a list of pairs (v, new_v) that indicate the value (new_v) each
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
75 variable (v) should take in order to carry out the optimization procedure.
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
76
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
77 The first section of the return value list corresponds to the terms in
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
78 `parameters`, and the optimization algorithm can return additional update
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
79 expression afterward. This list of pairs can be passed directly to the
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
80 dict() constructor to create a dictionary such that dct[v] == new_v.
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
81 """
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
82
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
83
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
84 Why not a class interface with an __init__ that takes the kwargs, and an
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
85 updates() that returns the updates? It would be wrong for auxiliary shared
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
86 variables to be involved in two updates, so the interface should not encourage
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
87 separate methods for those two steps.
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
88
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
89
baf1988db557 v2planning optimization - added API
James Bergstra <bergstrj@iro.umontreal.ca>
parents: 1036
diff changeset
90