view doc/v2_planning/sampler.txt @ 1083:4c00af69c164

dataset: Asking what we want from mini-batches
author Olivier Delalleau <delallea@iro>
date Fri, 10 Sep 2010 16:31:43 -0400
parents 875d53754bd0
children 0e12ea6ba661


Inference / Sampling committee: JB, GD, AC

OVERVIEW
========

Before we decide what a sampler is and how it should be defined in pylearn,
we should first know what we're up against.

The workflow I have in mind is the following:
1. identify the most popular sampling algorithms in the literature
2. get up to speed with methods we're not familiar with
3. identify common usage patterns, properties of the algorithm, etc.
4. decide on an API / best way to implement them
5. prioritize the algorithms
6. code away

1. BACKGROUND
=============

This section should provide a brief overview of what exists in the literature.
We should make sure to have a decent understanding of all of these (not everyone
has to be an expert, though), so that we can *intelligently* design our sampler
interface based on common usage patterns, properties, etc.

Sampling from basic distributions:
* already supported: uniform, normal, binomial, multinomial
* wish list: beta, poisson, others?

List of sampling algorithms:
* inversion sampling
* rejection sampling
* importance sampling
* Markov Chain Monte Carlo 
* Gibbs sampling
* Metropolis Hastings
* Slice Sampling
* Annealing
* Parallel Tempering, Tempered Transitions, Simulated Tempering
* Nested Sampling (?)
* Hamiltonian Monte Carlo (also known as Hybrid Monte Carlo; both names refer to the same algorithm)
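
To make the first two entries of the list concrete, here is a minimal sketch of
inversion sampling and rejection sampling in plain Python. It is only an
illustration, not a proposed pylearn interface: the function names
(sample_exponential, sample_rejection, beta_2_2_pdf) and the choice of target
distributions are assumptions made for this example, and only the standard
library is used.

```python
import math
import random


def sample_exponential(rate, rng):
    """Inversion sampling: push a uniform draw through the inverse CDF."""
    u = rng.random()  # u lies in [0, 1)
    # CDF is F(x) = 1 - exp(-rate * x), so F^{-1}(u) = -log(1 - u) / rate.
    return -math.log(1.0 - u) / rate


def sample_rejection(target_pdf, lo, hi, pdf_max, rng):
    """Rejection sampling on [lo, hi] with a uniform proposal.

    pdf_max must be an upper bound on target_pdf over [lo, hi].
    """
    while True:
        x = rng.uniform(lo, hi)
        # Accept x with probability target_pdf(x) / pdf_max.
        if rng.random() * pdf_max <= target_pdf(x):
            return x


def beta_2_2_pdf(x):
    """Density of Beta(2, 2) on [0, 1]; its maximum is 1.5 at x = 0.5."""
    return 6.0 * x * (1.0 - x)


rng = random.Random(0)
exp_draws = [sample_exponential(2.0, rng) for _ in range(20000)]
beta_draws = [sample_rejection(beta_2_2_pdf, 0.0, 1.0, 1.5, rng)
              for _ in range(20000)]

# Both sample means should land near the true means (0.5 in each case).
exp_mean = sum(exp_draws) / len(exp_draws)
beta_mean = sum(beta_draws) / len(beta_draws)
```

Both functions are stateless apart from the random number generator passed in,
which is what makes them easy to wrap behind a single "draw one sample" call.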

3. USAGE PATTERNS
=================

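* MCMC methods have a usage pattern that is quite different from that of the
  univariate sampling methods needed for nice-and-easy parametric families:
  an MCMC sampler carries the current state of its chain from one call to the
  next, whereas a draw from a parametric family is a single stateless call.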
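
The difference in usage can be sketched as follows. This is a rough
illustration under stated assumptions, not a design decision: MetropolisChain
and its methods (step, sample) are hypothetical names, the algorithm shown is a
plain random-walk Metropolis sampler, and the target is a standard normal
chosen only because its moments are easy to check.

```python
import math
import random


class MetropolisChain:
    """Random-walk Metropolis sampler (hypothetical class, not a pylearn API).

    Unlike a one-shot draw, the sampler keeps the current position of the
    chain between calls.
    """

    def __init__(self, log_density, initial, step_size, seed=0):
        self.log_density = log_density  # unnormalized log-density of the target
        self.state = initial            # current position of the chain
        self.step_size = step_size      # std. dev. of the Gaussian proposal
        self.rng = random.Random(seed)

    def step(self):
        """Advance the chain by one transition and return the new state."""
        proposal = self.state + self.rng.gauss(0.0, self.step_size)
        log_ratio = self.log_density(proposal) - self.log_density(self.state)
        # Accept with probability min(1, p(proposal) / p(state)).
        if log_ratio >= 0 or self.rng.random() < math.exp(log_ratio):
            self.state = proposal
        return self.state

    def sample(self, n_samples, burn_in=0):
        """Discard burn_in transitions, then return n_samples states."""
        for _ in range(burn_in):
            self.step()
        return [self.step() for _ in range(n_samples)]


def log_standard_normal(x):
    """Log-density of N(0, 1), up to an additive constant."""
    return -0.5 * x * x


# Parametric family: one stateless call, nothing to initialize or burn in.
stateless = random.Random(0).gauss(0.0, 1.0)

# MCMC: build a chain, burn it in, then read off correlated samples.
chain = MetropolisChain(log_standard_normal, initial=5.0, step_size=1.0, seed=0)
samples = chain.sample(20000, burn_in=1000)
chain_mean = sum(samples) / len(samples)
chain_var = sum((s - chain_mean) ** 2 for s in samples) / len(samples)
```

Whatever interface we settle on probably has to expose at least the pieces that
show up here: an initial state, a transition step, and some notion of burn-in.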