diff doc/v2_planning/API_formulas.txt @ 1174:fe6c25eb1e37
merge
author | pascanur |
---|---|
date | Fri, 17 Sep 2010 16:13:58 -0400 |
parents | 42ddbefd1e03 |
children |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/v2_planning/API_formulas.txt Fri Sep 17 16:13:58 2010 -0400
@@ -0,0 +1,96 @@
+.. _v2planning_formulas:
+
+Math formulas API
+=================
+
+Why we need a formulas API
+--------------------------
+
+There are a few reasons why a library of mathematical formulas for Theano is a good idea:
+
+* Some formulas need special handling for the GPU.
+  * Sometimes we need to cast to floatX...
+* Some formulas have numerical stability problems.
+* Some formulas have gradients with numerical stability problems. (This happens more frequently than the previous case.)
+  * If Theano does not always apply a stability optimization, we could do it manually in the formulas.
+* Some formulas are complex to implement and take many tries to get right.
+* We can mimic the hierarchy of other libraries to ease migration to Theano.
+
+Having a library helps in that we solve those problems only once.
+
+What is a formula
+-----------------
+
+We define formulas as something that does not have state. They are implemented as
+Python functions that take Theano variables as input and output Theano
+variables. If you want state, look at what the other committees will do.
+
+Formulas documentation
+----------------------
+
+We must respect what the coding committee has set for the docstrings of files and functions. In addition, each formula should document:
+
+* A LaTeX mathematical description of the formula (rendered as a picture in the generated documentation).
+* Tags (for searching):
+  * a list of the lower-level functions used
+  * category (the name of the submodule itself)
+* Whether we did some work to make it more numerically stable. Does Theano do the needed optimization?
+* Whether the gradient is numerically stable. Does Theano do the needed optimization?
+* Whether it works, does not work, or is untested on the GPU.
+* Alternate names.
+* The domain and range of the inputs/outputs (ranges should use the English interval notation for included or excluded endpoints, e.g. [0, 1)).
+
+Proposed hierarchy
+------------------
+
+Here is the proposed hierarchy for formulas:
+
+* pylearn.formulas.costs: generic / common cost functions, e.g. various cross-entropies, squared error,
+  abs. error, various sparsity penalties (L1, Student)
+* pylearn.formulas.regularization: formulas for regularization
+* pylearn.formulas.linear: formulas for linear classifier, linear regression, factor analysis, PCA
+* pylearn.formulas.nnet: formulas for building layers of various kinds, various activation functions,
+  layers which could be plugged with various costs & penalties, and stacked
+* pylearn.formulas.ae: formulas for auto-encoders and denoising auto-encoder variants
+* pylearn.formulas.noise: formulas for corruption processes
+* pylearn.formulas.rbm: energies, free energies, conditional distributions, Gibbs sampling
+* pylearn.formulas.trees: formulas for decision trees
+* pylearn.formulas.boosting: formulas for boosting variants
+* pylearn.formulas.maths: other math formulas
+* pylearn.formulas.scipy.stats: example of implementing the same interface as an existing library
+
+etc.
+
+Example
+-------
+.. code-block:: python
+
+    """
+    This script defines a few often used cost functions.
+    """
+    import theano
+    import theano.tensor as T
+    from tags import tags
+
+    @tags('cost','binary','cross-entropy')
+    def binary_crossentropy(output, target):
+        r""" Compute the cross-entropy of a binary output with respect to a binary target.
+
+        .. math::
+            L_{CE} \equiv -(t\log(o) + (1-t)\log(1-o))
+
+        :type output: Theano variable
+        :param output: Binary output or prediction :math:`\in[0,1]`
+        :type target: Theano variable
+        :param target: Binary target usually :math:`\in\{0,1\}`
+        """
+        return -(target * T.log(output) + (1.0 - target) * T.log(1.0 - output))
+
+
+TODO
+----
+* Define a list of search tags to start with.
+* Add to the HTML page a list of the tags and, for each tag, a list of the functions associated with it.
+* Move existing formulas to pylearn as examples and add other basic ones.
+* theano.tensor.nnet will probably be copied to pylearn.formulas.nnet and deprecated.
+
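The example imports a ``tags`` decorator that this changeset does not define, and the TODO list asks for a per-tag listing of functions. One possible shape for such a decorator is sketched below. Everything in it is an assumption made for illustration: the ``TAG_INDEX`` registry, the ``functions_with_tag`` helper, and the ``.tags`` attribute are hypothetical names, and the cost function uses plain Python floats as a stand-in for Theano variables so that the sketch runs on its own.

```python
import math
from collections import defaultdict

# Hypothetical registry: tag name -> names of the formulas carrying that tag.
TAG_INDEX = defaultdict(list)


def tags(*names):
    """Attach search tags to a formula and record it in TAG_INDEX."""
    def decorator(fn):
        fn.tags = tuple(names)  # keep the tags on the function itself
        for name in names:
            TAG_INDEX[name].append(fn.__name__)
        return fn
    return decorator


@tags('cost', 'binary', 'cross-entropy')
def binary_crossentropy(output, target):
    """Plain-float stand-in for the Theano formula (illustration only)."""
    return -(target * math.log(output)
             + (1.0 - target) * math.log(1.0 - output))


def functions_with_tag(name):
    """Return the sorted names of all formulas registered under a tag."""
    return sorted(TAG_INDEX.get(name, []))
```

A documentation generator could then call ``functions_with_tag('cost')`` for each known tag to build the per-tag listing. Because the decorator returns the original function unchanged, the stateless nature of a formula is preserved.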