view doc/v2_planning/sampler.txt @ 1517:a6e634b83d88

allow to read filetensor compressed with bz2
author Frederic Bastien <nouiz@nouiz.org>
date Wed, 09 May 2012 11:56:28 -0400
parents 0e12ea6ba661


Inference / Sampling committee: JB, GD, AC

OVERVIEW
========

Before we decide what a sampler is and how it should be defined in
pylearn, we should first know what we're up against.

The workflow I have in mind is the following:
1. identify the most popular sampling algorithms in the literature
2. get up to speed with methods we're not familiar with
3. identify common usage patterns, properties of the algorithm, etc.
4. decide on an API / best way to implement them
5. prioritize the algorithms
6. code away

1. BACKGROUND
=============

This section should provide a brief overview of what exists in the literature.
We should make sure to have a decent understanding of all of these (not everyone
has to be an expert, though), so that we can *intelligently* design our sampler
interface based on common usage patterns, properties, etc.

Sampling from basic distributions
* already supported: uniform, normal, binomial, multinomial
* wish list: beta, poisson, others?

List of sampling algorithms:
* inversion sampling
* rejection sampling
* importance sampling
* Markov Chain Monte Carlo (MCMC)
* Gibbs sampling
* Metropolis-Hastings
* Slice Sampling
* Annealing
* Parallel Tempering, Tempered Transitions, Simulated Tempering
* Nested Sampling (?)
* Hamiltonian Monte Carlo (also known as Hybrid Monte Carlo; both names refer to the same algorithm)
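
As a point of reference for the two simplest entries in the list above, here
is a minimal sketch of inversion sampling and rejection sampling. It uses only
the Python standard library. The function names and signatures are
hypothetical illustrations, not existing or proposed pylearn API, and the
targets (an exponential and a Beta(2, 2) density) were picked only because
their formulas are short.

```python
import math
import random


def sample_exponential_by_inversion(rate, rng):
    """Inversion sampling: push a uniform draw through the inverse CDF.

    For Exponential(rate), F(x) = 1 - exp(-rate * x), so
    F^{-1}(u) = -log(1 - u) / rate.
    """
    u = rng.random()  # u lies in [0, 1), so 1 - u lies in (0, 1]
    return -math.log(1.0 - u) / rate


def sample_by_rejection(target_pdf, propose, proposal_pdf, bound, rng):
    """Rejection sampling (hypothetical helper).

    Assumes target_pdf(x) <= bound * proposal_pdf(x) for every x.
    """
    while True:
        x = propose(rng)
        # Accept x with probability target_pdf(x) / (bound * proposal_pdf(x)).
        if rng.random() * bound * proposal_pdf(x) <= target_pdf(x):
            return x


rng = random.Random(0)
exp_draws = [sample_exponential_by_inversion(2.0, rng) for _ in range(20000)]

# Target: Beta(2, 2), density 6 x (1 - x) on [0, 1], maximum 1.5 at x = 0.5.
# Proposal: Uniform(0, 1) with density 1, so bound = 1.5 is sufficient.
beta_draws = [
    sample_by_rejection(
        target_pdf=lambda x: 6.0 * x * (1.0 - x),
        propose=lambda r: r.random(),
        proposal_pdf=lambda x: 1.0,
        bound=1.5,
        rng=rng,
    )
    for _ in range(20000)
]
```

Both functions are stateless: every call returns an independent draw and
nothing needs to be carried over between calls.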

3. USAGE PATTERNS
=================

* MCMC methods have a usage pattern that is quite different from the kind of univariate sampling methods
  needed for nice-and-easy parametric families.
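
To make this contrast concrete, here is a minimal sketch assuming a 1-D
target and a Gaussian random-walk proposal. A draw from a parametric family is
a stateless function call, whereas an MCMC sampler carries a current state,
typically needs a burn-in period, and returns correlated samples. The class
name RandomWalkMetropolis, its methods, and the burn_in / thin parameters are
hypothetical and only illustrate the pattern; they are not a proposed pylearn
interface.

```python
import math
import random


class RandomWalkMetropolis:
    """Hypothetical stateful sampler: random-walk Metropolis on a 1-D target."""

    def __init__(self, log_density, initial_state, step_size, seed=0):
        self.log_density = log_density  # unnormalized log p(x)
        self.state = initial_state      # current position of the chain
        self.step_size = step_size      # std. dev. of the Gaussian proposal
        self.rng = random.Random(seed)

    def step(self):
        """Advance the chain by one transition and return the new state."""
        proposal = self.state + self.rng.gauss(0.0, self.step_size)
        log_ratio = self.log_density(proposal) - self.log_density(self.state)
        # The proposal is symmetric, so accept with probability min(1, p'/p).
        # 1 - random() lies in (0, 1], which keeps the log finite.
        if math.log(1.0 - self.rng.random()) < log_ratio:
            self.state = proposal
        return self.state

    def draw(self, n_samples, burn_in=0, thin=1):
        """Discard burn_in transitions, then keep every thin-th state."""
        for _ in range(burn_in):
            self.step()
        samples = []
        for _ in range(n_samples):
            for _ in range(thin):
                self.step()
            samples.append(self.state)
        return samples


# Stateless pattern: each call is an independent draw.
direct_rng = random.Random(0)
direct = [direct_rng.gauss(0.0, 1.0) for _ in range(5000)]

# Stateful pattern: the same standard normal target, reached through a chain
# that starts far from the mode and must be burned in first.
chain = RandomWalkMetropolis(lambda x: -0.5 * x * x,
                             initial_state=5.0, step_size=1.0)
mcmc = chain.draw(5000, burn_in=500, thin=5)
```

The sampler object has to persist between calls because the next sample
depends on chain.state. A one-shot functional interface does not have this
requirement, so the two patterns may need different treatment in the API.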