--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/v2_planning/API_formulas.txt	Fri Sep 17 13:57:07 2010 -0400
@@ -0,0 +1,96 @@
+.. _v2planning_formulas:
+
+Math formulas API
+=================
+
+Why we need a formulas API
+--------------------------
+
+There are several reasons why a library of mathematical formulas for Theano is a good idea:
+
+* Some formulas need special handling on the GPU.
+   * Sometimes we need to cast to floatX...
+* Some formulas have numerical stability problems.
+* The gradients of some formulas have numerical stability problems. (This happens more frequently than the previous case.)
+   * If Theano does not always apply the needed stability optimization, we could do it manually in the formulas.
+* Some formulas are complex to implement and take many tries to get right.
+* The library can mimic the hierarchy of other libraries to ease migration to Theano.
+
+Having a library helps because we solve those problems only once.
+
+What is a formula
+-----------------
+
+We define formulas as something that does not have state. They are implemented as
+Python functions that take Theano variables as input and output Theano
+variables. If you want state, look at what the other committees will do.
+
+Formulas documentation
+----------------------
+
+We must respect what the coding committee has set for the docstrings of the file and of the functions. Each formula's documentation must include:
+
+* A LaTeX mathematical description of the formula (rendered as an image in the generated documentation)
+* Tags (for searching):
+   * a list of the lower-level functions used
+   * category (the name of the submodule itself)
+* Whether we did some work to make it more numerically stable. Does Theano do the needed optimization?
+* Whether the gradient is numerically stable. Does Theano do the needed optimization?
+* Whether it works, does not work, or has unknown status on the GPU.
+* Alternate names
+* The domain and range of the inputs/outputs (ranges should use the English notation for inclusive and exclusive bounds)
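Reviewer sketch: the tag requirement above could be met with a small decorator that stores the tags on the function and keeps a searchable index. This is only an illustration under assumptions: the `tags` decorator, `TAG_INDEX`, and `formulas_with_tag` are hypothetical names, not an existing pylearn or Theano API, and the formula body is a placeholder.

```python
from collections import defaultdict

# Hypothetical registry (assumption): tag name -> names of the tagged formulas.
TAG_INDEX = defaultdict(list)


def tags(*names):
    """Attach search tags to a formula and record it in TAG_INDEX."""
    def decorator(fn):
        fn.tags = tuple(names)
        for name in names:
            TAG_INDEX[name].append(fn.__name__)
        return fn
    return decorator


@tags('cost', 'binary', 'cross-entropy')
def binary_crossentropy(output, target):
    """Placeholder: a real formula would build a Theano expression here."""
    raise NotImplementedError


def formulas_with_tag(name):
    """Return the sorted names of all formulas registered under a tag."""
    return sorted(TAG_INDEX.get(name, []))
```

An index like this would also give the generated HTML page a direct source for the per-tag function lists mentioned in the TODO.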
+
+Proposed hierarchy
+------------------
+
+Here is the proposed hierarchy for formulas:
+
+* pylearn.formulas.costs: generic / common cost functions, e.g. various cross-entropies, squared error,
+  abs. error, various sparsity penalties (L1, Student)
+* pylearn.formulas.regularization: formulas for regularization
+* pylearn.formulas.linear: formulas for linear classifiers, linear regression, factor analysis, PCA
+* pylearn.formulas.nnet: formulas for building layers of various kinds, various activation functions,
+  layers which could be plugged with various costs & penalties, and stacked
+* pylearn.formulas.ae: formulas for auto-encoders and denoising auto-encoder variants
+* pylearn.formulas.noise: formulas for corruption processes
+* pylearn.formulas.rbm: energies, free energies, conditional distributions, Gibbs sampling
+* pylearn.formulas.trees: formulas for decision trees
+* pylearn.formulas.boosting: formulas for boosting variants
+* pylearn.formulas.maths: other math formulas
+* pylearn.formulas.scipy.stats: example of implementing the same interface as an existing library
+
+etc.
+
+Example
+-------
+.. code-block:: python
+
+        """
+        This script defines a few often used cost functions.
+        """
+        import theano
+        import theano.tensor as T
+        from tags import tags
+
+        @tags('cost','binary','cross-entropy')
+        def binary_crossentropy(output, target):
+            r""" Compute the cross-entropy of binary output wrt binary target.
+
+            .. math::
+                L_{CE} \equiv -\left(t\log(o) + (1-t)\log(1-o)\right)
+
+            :type output: Theano variable
+            :param output: Binary output or prediction :math:`\in[0,1]`
+            :type target: Theano variable
+            :param target: Binary target usually :math:`\in\{0,1\}`
+            """
+            # theano.tensor is imported as T above, so use T.log here.
+            return -(target * T.log(output) + (1.0 - target) * T.log(1.0 - output))
+
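Reviewer sketch: the numerical stability concern listed under "Why we need a formulas API" can be illustrated with the cross-entropy example above. This uses plain Python `math` rather than Theano so that it runs anywhere. The names `softplus`, `naive_bce` and `stable_bce_from_logit` are illustrative assumptions, and working from the pre-sigmoid activation `z` is one common reformulation, not something the planning notes prescribe.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def naive_bce(output, target):
    """Direct transcription of the formula; fails when output is exactly 0 or 1."""
    return -(target * math.log(output) + (1.0 - target) * math.log(1.0 - output))


def stable_bce_from_logit(z, target):
    """Cross-entropy of sigmoid(z) wrt target, without forming sigmoid(z).

    Uses the identity
    -t*log(s(z)) - (1-t)*log(1 - s(z)) = softplus(z) - t*z.
    """
    return softplus(z) - target * z
```

For a large activation such as `z = 40.0`, the sigmoid rounds to exactly `1.0` in double precision, so the naive form hits `log(0)`, while the reformulated version stays finite.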
+
+TODO
+----
+* Define a list of search tags to start with.
+* Add to the HTML page a list of the tags and, for each tag, a list of the functions associated with it.
+* Move existing formulas to pylearn as examples and add other basic ones.
+* theano.tensor.nnet will probably be copied to pylearn.formulas.nnet and deprecated.
+
--- a/doc/v2_planning/formulas.txt	Fri Sep 17 13:56:22 2010 -0400
+++ b/doc/v2_planning/formulas.txt	Fri Sep 17 13:57:07 2010 -0400
@@ -9,47 +9,6 @@
 - Olivier B.
 - Nicolas

-TODO
-----
-* Define a list of search tags to start with.
-* Propose an interface (many inputs, outputs, doc style, hierarchy, how to search, HTML output?).
-* Find existing repositories with files for formulas.
-* Move existing formulas to pylearn as examples and add other basic ones.
-** theano.tensor.nnet will probably be copied to pylearn.formulas.nnet and deprecated.
-
-Why we need formulas
---------------------
-
-There are several reasons why a library of mathematical formulas for Theano is a good idea:
-
-* Some formulas need special handling on the GPU.
-   * Sometimes we need to cast to floatX...
-* Some formulas have numerical stability problems.
-* The gradients of some formulas have numerical stability problems. (This happens more frequently than the previous case.)
-   * If Theano does not always apply the needed stability optimization, we could do it manually in the formulas.
-* Some formulas are complex to implement and take many tries to get right.
-
-Having a library helps because we solve those problems only once.
-
-Formulas definition
--------------------
-
-We define formulas as something that does not have state. They are implemented as Python functions
-that take Theano variables as input and output Theano variables. If you want state, look at what the
-learner committee will do.
-
-Formulas doc must have
-----------------------
-
-* A LaTeX mathematical description of the formula (rendered as an image in the generated documentation)
-* Tags (for searching):
-   * a list of the lower-level functions used
-   * category (the name of the submodule itself)
-* Whether we did some work to make it more numerically stable. Does Theano do the needed optimization?
-* Whether the gradient is numerically stable. Does Theano do the needed optimization?
-* Whether it works, does not work, or has unknown status on the GPU.
-* Alternate names
-* The domain and range of the inputs/outputs (ranges should use the English notation for inclusive and exclusive bounds)

 List of existing repos
 ----------------------
@@ -57,33 +16,3 @@
 Olivier B. ?
 Xavier G.: git@github.com:glorotxa/DeepANN.git, see file deepANN/{Activations.py (to nnet), Noise.py, Reconstruction_cost.py (to costs), Regularization.py (to regularization)}

-Proposed hierarchy
-------------------
-
-Here is the proposed hierarchy for formulas
-
-pylearn.formulas.costs: generic / common cost functions, e.g. various cross-entropies, squared error,
-abs. error, various sparsity penalties (L1, Student)
-
-pylearn.formulas.regularization: formulas for regularization
-
-pylearn.formulas.linear: formulas for linear classifier, linear regression, factor analysis, PCA
-
-pylearn.formulas.nnet: formulas for building layers of various kinds, various activation functions,
-layers which could be plugged with various costs & penalties, and stacked
-
-pylearn.formulas.ae: formulas for auto-encoders and denoising auto-encoder variants
-
-pylearn.formulas.noise: formulas for corruption processes
-
-pylearn.formulas.rbm: energies, free energies, conditional distributions, Gibbs sampling
-
-pylearn.formulas.trees: formulas for decision trees
-
-pylearn.formulas.boosting: formulas for boosting variants
-
-pylearn.formulas.maths for other math formulas
-
-pylearn.formulas.scipy.stats: example to implement the same interface as existing lib
-
-etc.