diff doc/v2_planning/formulas.txt @ 1051:bc246542d6ff

Added file for the formulas committee.
author Frederic Bastien <nouiz@nouiz.org>
date Wed, 08 Sep 2010 15:39:51 -0400
parents
children 42ddbefd1e03
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/v2_planning/formulas.txt	Wed Sep 08 15:39:51 2010 -0400
@@ -0,0 +1,89 @@
+Math formulas
+=============
+
+Participants
+------------
+- Fred*
+- Razvan
+- Aaron
+- Olivier B.
+- Nicolas
+
+TODO 
+----
+* Define a list of search tags to start with.
+* Propose an interface (many inputs, outputs, doc style, hierarchy, how to search, HTML output?).
+* Find existing repositories with files for formulas.
+* Move existing formulas to pylearn as examples and add other basic ones.
+** theano.tensor.nnet will probably be copied to pylearn.formulas.nnet and deprecated.
+
+Why we need formulas
+--------------------
+
+There are a few reasons why a library of mathematical formulas for Theano is a good idea:
+
+* Some formulas need special handling to run on the GPU.
+   * Sometimes we need to cast to floatX...
+* Some formulas have numerical stability problems.
+* Some formulas have gradients with numerical stability problems. (This happens more frequently than the previous case.)
+   * If Theano does not always apply a stability optimization, we could do it manually in the formulas.
+* Some formulas are complex to implement and take many tries to get right.
+
+Having a library helps in that we solve those problems only once.
+
+Formulas definition
+-------------------
+
+We define formulas as things that don't have state. They are implemented as Python functions
+that take Theano variables as input and output Theano variables. If you want state, look at what the
+learner committee will do.
+
+Formulas doc must have
+----------------------
+
+* A LaTeX mathematical description of the formula (for picture representation in generated documentation).
+* Tags (for searching):
+   * a list of lower-level functions used
+   * category (name of the submodule itself)
+* Tell if we did some work to make it more numerically stable. Does Theano do the needed optimization?
+* Tell if the gradient is numerically stable. Does Theano do the needed optimization?
+* Tell if it works on GPU, does not, or unknown.
+* Tell alternate names.
+* Tell the domain and range of the input/output (ranges should use the English interval notation for including or excluding endpoints, e.g. [0, 1)).
+
+List of existing repos
+----------------------
+
+Olivier B. ?
+Xavier G.: git@github.com:glorotxa/DeepANN.git, see files deepANN/{Activations.py (to nnet), Noise.py, Reconstruction_cost.py (to costs), Regularization.py (to regularization)}
+
+Proposed hierarchy
+------------------
+
+Here is the proposed hierarchy for formulas:
+
+pylearn.formulas.costs: generic / common cost functions, e.g. various cross-entropies, squared error, 
+abs. error, various sparsity penalties (L1, Student)
+
+pylearn.formulas.regularization: formulas for regularization
+
+pylearn.formulas.linear: formulas for linear classifiers, linear regression, factor analysis, PCA
+
+pylearn.formulas.nnet: formulas for building layers of various kinds, various activation functions,
+layers which could be plugged with various costs & penalties, and stacked
+
+pylearn.formulas.ae: formulas for auto-encoders and denoising auto-encoder variants
+
+pylearn.formulas.noise: formulas for corruption processes
+
+pylearn.formulas.rbm: energies, free energies, conditional distributions, Gibbs sampling
+
+pylearn.formulas.trees: formulas for decision trees
+
+pylearn.formulas.boosting: formulas for boosting variants
+
+pylearn.formulas.maths: other math formulas
+
+pylearn.formulas.scipy.stats: example of implementing the same interface as an existing library
+
+etc.
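
To make the notes above more concrete, here is a minimal sketch of what one stateless, documented, tagged formula could look like. It is not part of the changeset and not a proposed pylearn API. The `formula` decorator, the `FORMULAS_BY_TAG` registry and the docstring layout are hypothetical, and plain Python floats with the standard `math` module stand in for Theano variables so that the sketch runs on its own. It uses softplus to show the numerical stability point: the direct transcription overflows for large inputs, while an algebraically equivalent rewrite does not.

```python
import math

# Hypothetical registry mapping a search tag to formula names. The planning
# notes only say that tags should exist, not how they are stored.
FORMULAS_BY_TAG = {}


def formula(*tags):
    """Hypothetical decorator: attach search tags to a formula function."""
    def register(fn):
        fn.tags = tuple(tags)
        for tag in tags:
            FORMULAS_BY_TAG.setdefault(tag, []).append(fn.__name__)
        return fn
    return register


def naive_softplus(x):
    """Direct transcription of log(1 + exp(x)); exp overflows for large x."""
    return math.log(1.0 + math.exp(x))


@formula("nnet", "activation", "log", "exp")
def softplus(x):
    r"""Softplus activation (stateless: output depends only on the input).

    .. math:: \mathrm{softplus}(x) = \log(1 + e^{x})

    Tags: category nnet; lower-level functions used: log, exp
    Numerical stability: rewritten as max(x, 0) + log1p(exp(-|x|)),
        so exp never receives a positive argument.
    Gradient stability: not addressed in this sketch.
    GPU: unknown
    Domain: (-inf, +inf); range: (0, +inf) mathematically
    """
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def search(tag):
    """Return the names of the formulas registered under `tag`."""
    return list(FORMULAS_BY_TAG.get(tag, []))
```

With this sketch, `search("nnet")` lists `softplus`, and `softplus(1000.0)` returns a finite value where `naive_softplus(1000.0)` raises `OverflowError`. A Theano version would presumably keep the same shape (a plain function from variables to variables, with the stability rewrite done once inside the library) but build a symbolic expression from elementwise tensor operations instead of calling `math`.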