view doc/v2_planning/sampler.txt @ 1031:480cc8ac2032

committees: Marked committees leaders with a *
author Olivier Delalleau <delallea@iro>
date Mon, 06 Sep 2010 22:06:13 -0400
parents a1b6ccd5b6dc
children 875d53754bd0


Inference / Sampling committee: JB, GD, AC

OVERVIEW
========

Before we decide what a sampler is and how it should be defined in pylearn,
we should first know what we're up against.

The workflow I have in mind is the following:
1. identify the most popular sampling algorithms in the literature
2. get up to speed with methods we're not familiar with
3. identify common usage patterns, properties of the algorithm, etc.
4. decide on an API / best way to implement them
5. prioritize the algorithms
6. code away

1. BACKGROUND
=============

This section should provide a brief overview of what exists in the literature.
We should make sure to have a decent understanding of all of these (not everyone
has to be an expert, though), so that we can *intelligently* design our sampler
interface based on common usage patterns, properties, etc.

Sampling from basic distributions:
* already supported: uniform, normal, binomial, multinomial
* wish list: beta, poisson, others?
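
To make the list above concrete, here is a rough sketch of what drawing from
each of these distributions looks like. It uses NumPy's RandomState purely as
a stand-in (an assumption: this is not the pylearn interface, which is still
to be decided), and the parameter values are arbitrary.

```python
import numpy as np

rng = np.random.RandomState(0)  # fixed seed so the draws are reproducible
n = 1000                        # number of draws per distribution (arbitrary)

samples = {
    # already supported
    "uniform": rng.uniform(low=0.0, high=1.0, size=n),
    "normal": rng.normal(loc=0.0, scale=1.0, size=n),
    "binomial": rng.binomial(n=10, p=0.3, size=n),
    # one-hot rows: each draw picks one of three categories
    "multinomial": rng.multinomial(n=1, pvals=[0.2, 0.3, 0.5], size=n),
    # wish list
    "beta": rng.beta(a=2.0, b=5.0, size=n),
    "poisson": rng.poisson(lam=4.0, size=n),
}

for name, draws in sorted(samples.items()):
    print(name, draws.shape)
```

Note that every call here is stateless from the user's point of view: give the
parameters and a size, get independent draws back.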

List of sampling algorithms:

* inversion sampling
* rejection sampling
* importance sampling
* Markov Chain Monte Carlo 
* Gibbs sampling
* Metropolis Hastings
* Slice Sampling
* Annealing
* Parallel Tempering, Tempered Transitions, Simulated Tempering
* Nested Sampling (?)
* Hamiltonian Monte Carlo (also called Hybrid Monte Carlo; same algorithm, two names)
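
As a warm-up for step 2 of the workflow, here is a minimal sketch of the two
simplest entries in the list above, inversion sampling and rejection sampling.
The function names and the choice of targets (an exponential and a Beta(2, 2)
density) are illustrative assumptions, not a proposed API.

```python
import numpy as np


def inversion_sample_exponential(rate, size, rng):
    """Draw Exponential(rate) samples by inverting F(x) = 1 - exp(-rate * x)."""
    u = rng.uniform(size=size)        # u lies in [0, 1)
    return -np.log1p(-u) / rate       # F^{-1}(u) = -log(1 - u) / rate


def rejection_sample(target_pdf, bound, size, rng):
    """Draw from a density on [0, 1] using a Uniform(0, 1) proposal.

    `bound` must satisfy target_pdf(x) <= bound for every x in [0, 1].
    """
    accepted = []
    while len(accepted) < size:
        x = rng.uniform(size=size)    # candidates from the proposal
        u = rng.uniform(size=size)    # one acceptance test per candidate
        accepted.extend(x[u * bound <= target_pdf(x)])
    return np.array(accepted[:size])


rng = np.random.RandomState(0)
exp_draws = inversion_sample_exponential(2.0, 20000, rng)
# Beta(2, 2) density is 6 x (1 - x); its maximum on [0, 1] is 1.5 at x = 0.5.
beta_draws = rejection_sample(lambda x: 6.0 * x * (1.0 - x), 1.5, 20000, rng)
print(exp_draws.mean(), beta_draws.mean())
```

Both of these return independent samples and need no state between calls,
unlike the MCMC methods further down the list.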

3. USAGE PATTERNS
=================

* MCMC methods have a usage pattern that is quite different from that of the univariate sampling methods
needed for nice-and-easy parametric families.
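
One way to see the difference, as a sketch only: a parametric draw is a single
stateless call, whereas an MCMC sampler carries a current state, is advanced
one transition at a time, and typically needs a burn-in period. The class
below is a hypothetical random-walk Metropolis sampler written for this
illustration (the name, constructor arguments and step() method are all
assumptions, not an agreed interface).

```python
import numpy as np


class RandomWalkMetropolis(object):
    """Hypothetical stateful sampler: remembers where the chain currently is."""

    def __init__(self, log_prob, initial_state, step_size, rng):
        self.log_prob = log_prob          # unnormalized log density of the target
        self.state = float(initial_state)
        self.step_size = step_size
        self.rng = rng

    def step(self):
        """Apply one Metropolis transition and return the new state."""
        proposal = self.state + self.step_size * self.rng.normal()
        log_ratio = self.log_prob(proposal) - self.log_prob(self.state)
        # 1 - uniform() lies in (0, 1], so the log is always finite.
        if np.log(1.0 - self.rng.uniform()) < log_ratio:
            self.state = proposal
        return self.state


rng = np.random.RandomState(0)

# Parametric family: one call, independent draws, nothing to keep around.
direct = rng.normal(size=5000)

# MCMC: build a chain, discard burn-in, then collect correlated draws.
chain = RandomWalkMetropolis(lambda x: -0.5 * x ** 2,
                             initial_state=5.0, step_size=1.0, rng=rng)
for _ in range(500):
    chain.step()                          # burn-in, samples thrown away
mcmc = np.array([chain.step() for _ in range(20000)])
print(direct.mean(), mcmc.mean(), mcmc.var())
```

The target here is a standard normal, chosen only so the two approaches can be
compared directly; the burn-in length and step size are arbitrary.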