view doc/v2_planning/optimization.txt @ 1153:ae5ba6206fd3
a first draft of pseudo-code for logreg .. using version B (?) approach
author | Razvan Pascanu <r.pascanu@gmail.com> |
---|---|
date | Thu, 16 Sep 2010 17:34:30 -0400 |
parents | 7c5dc11c850a |
children | f2105a06201c |
=========================
Optimization for Learning
=========================

Members: Bergstra, Lamblin, Delalleau, Glorot, Breuleux, Bordes
Leader: Bergstra


Initial Writeup by James
========================

Previous work - scikits, openopt, scipy provide function optimization
algorithms. These are not currently GPU-enabled but may be in the future.


IS PREVIOUS WORK SUFFICIENT?
----------------------------

In many cases it is (I used it for sparse coding, and it was ok).

These packages provide batch optimization, whereas we typically need online
optimization.

It can be faster (to run) and more convenient (to implement) to have
optimization algorithms as Theano update expressions.


What optimization algorithms do we want/need?
---------------------------------------------

- sgd
- sgd + momentum
- sgd with annealing schedule
- TONGA
- James Martens' Hessian-free
- Conjugate gradients, batch and (large) mini-batch [that is also what
  Martens' method does]

Do we need anything to make batch algos work better with Pylearn things?

- conjugate methods? yes
- L-BFGS? maybe, when needed


Proposal for API
================

See api_optimization.txt.

OD asks: Do we really need a different file? If yes, maybe create a
subdirectory to be able to easily find all files related to optimization?

JB replies: Yoshua's orders.

OD asks: Could it be more convenient for x0 to be a list?

JB replies: Yes, but that's not the interface used by other minimize()
routines (e.g. in scipy). Maybe another list-based interface is required?

OD replies: I think most people would prefer to use a list-based interface,
so they don't have to manually pack / unpack multiple arrays of parameters.
So I would vote in favor of having both (where the main reason to also
provide a non-list interface would be to allow one to easily switch e.g. to
scipy's minimize).
I would guess that scipy's interface is like this because it makes things
easier for the optimization algorithm. However, this does not really matter
if we are just wrapping a Theano-based algorithm (that already has to handle
multiple parameters), and avoiding useless data copies on each call to
f / df can only help speed-wise.

JB replies: Done, I added the possibility that x0 is a list of ndarrays to
the api doc.

OD asks: Why make a difference between iterative and one-shot versions? A
one-shot algorithm can be seen as an iterative one that stops after its
first iteration. The difference I see between the two interfaces proposed
here is mostly that one relies on Theano while the other one does not, but
hopefully a non-Theano one can be created by simply wrapping around the
Theano one.

JB replies: Right, it would make more sense to distinguish them by the fact
that one works on Theano objects, and the other on general Python callable
functions. There is room for an iterative numpy interface, but I haven't
written it yet. Would that answer your question?

OD replies and asks: Partly. Do we really need a non-iterative interface?
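
To make the two points of this exchange concrete (a list-based x0, and a one-shot routine as a thin wrapper around an iterative one), here is a minimal sketch in plain Python. It is not the interface from api_optimization.txt: the names ``iterative_sgd``, ``minimize``, ``quadratic`` and ``quadratic_grad``, the stopping rule, and all default values are assumptions made for illustration, and plain lists of floats stand in for ndarrays.

```python
def iterative_sgd(df, x0, step_size=0.1, momentum=0.0):
    """Hypothetical iterative interface: yield the parameters after each step.

    x0 is a list of parameter blocks (plain lists of floats standing in for
    ndarrays). df(x) must return gradients with the same nested structure.
    """
    x = [list(block) for block in x0]            # copy, so x0 is not mutated
    velocity = [[0.0] * len(block) for block in x]
    while True:
        grads = df(x)
        for block, vel, grad in zip(x, velocity, grads):
            for i, g in enumerate(grad):
                # sgd + momentum update; momentum=0.0 gives plain sgd
                vel[i] = momentum * vel[i] - step_size * g
                block[i] += vel[i]
        yield [list(block) for block in x]       # hand out a snapshot


def minimize(f, df, x0, tol=1e-10, max_iter=10000, **sgd_options):
    """Hypothetical one-shot interface built by driving the iterative one.

    Stops when the cost changes by less than tol between two consecutive
    steps (an assumed stopping rule), or after max_iter steps.
    """
    x = [list(block) for block in x0]
    prev_cost = f(x)
    for n_iter, x in enumerate(iterative_sgd(df, x0, **sgd_options), start=1):
        cost = f(x)
        if abs(prev_cost - cost) < tol or n_iter >= max_iter:
            break
        prev_cost = cost
    return x


def quadratic(x):
    """Toy cost with two parameter blocks: (w - 3)^2 + (b + 1)^2."""
    (w,), (b,) = x
    return (w - 3.0) ** 2 + (b + 1.0) ** 2


def quadratic_grad(x):
    """Gradient of quadratic, returned block by block like x itself."""
    (w,), (b,) = x
    return [[2.0 * (w - 3.0)], [2.0 * (b + 1.0)]]
```

Calling ``minimize(quadratic, quadratic_grad, [[0.0], [0.0]])`` runs the iterative routine to its stopping point without the caller ever packing the two blocks into one flat vector. A flat-vector wrapper for compatibility with scipy-style routines could be layered on top in the same way.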