view doc/v2_planning/API_learner.txt @ 1359:5db730bb0e8e

comments on datalearn
author James Bergstra <bergstrj@iro.umontreal.ca>
date Thu, 11 Nov 2010 17:53:13 -0500
parents 317049b21b77

.. _v2planning_learner:

Learner API
===========

A list of "task types"
----------------------

Attributes
~~~~~~~~~~

- sequential
- spatial
- structured
- semi-supervised
- missing-values


Supervised (x,y)
~~~~~~~~~~~~~~~~

- classification
- regression
- probabilistic classification
- ranking
- conditional density estimation
- collaborative filtering
- ordinal regression (same as ranking?)

Unsupervised (x)
~~~~~~~~~~~~~~~~

- de-noising
- feature learning (transformation), e.g. PCA, DAA
- density estimation
- inference

Other
~~~~~

- generation (sampling)
- structure learning ???


Notes on metrics & statistics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- some are applied to a single example, others to a batch
- most statistics are computed over the whole dataset
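
The distinction between per-example metrics, batch metrics, and dataset-level statistics can be illustrated with a small sketch. The function names below (``zero_one_loss``, ``batch_error_rate``, ``dataset_error_rate``) are hypothetical and not part of any agreed API; a batch is assumed to be a pair of plain Python lists (predictions, targets).

```python
def zero_one_loss(prediction, target):
    """Per-example metric: 1 if the prediction is wrong, else 0."""
    return int(prediction != target)


def batch_error_rate(predictions, targets):
    """Batch metric: mean of the per-example losses over one batch."""
    losses = [zero_one_loss(p, t) for p, t in zip(predictions, targets)]
    return sum(losses) / len(losses)


def dataset_error_rate(batches):
    """Dataset statistic: error rate accumulated over all batches.

    Errors and examples are counted separately, so batches of
    unequal size are weighted correctly.
    """
    n_errors = 0
    n_examples = 0
    for predictions, targets in batches:
        n_errors += sum(zero_one_loss(p, t)
                        for p, t in zip(predictions, targets))
        n_examples += len(targets)
    return n_errors / n_examples


# Two batches of unequal size: (predictions, targets)
batches = [([0, 1, 1], [0, 1, 0]), ([1], [1])]
print(batch_error_rate(*batches[0]))
print(dataset_error_rate(batches))
```

Averaging the two batch error rates directly would give a different number than the dataset error rate, which is one reason to keep the two notions separate in the API.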


The Learner class
-----------------

.. code-block:: python

    class Learner(object):
        '''
        Takes data as input and learns one or more prediction functions.

        A learner is parametrized by hyper-parameters, which can be set from
        the outside by a "client" of the Learner (e.g. a HyperLearner or a
        Tester).

        The data can be given all at once as a dataset, or incrementally.
        Some learners need to be fully trained in one step, whereas others
        can be trained incrementally.

        The question of statistics collection during training remains open.
        '''
        #def use_dataset(dataset)

        # Return a dictionary mapping hyper-parameter names (keys)
        # to their values.
        def get_hyper_parameters(self):
            ...
        def set_hyper_parameters(self, dictionary):
            ...


        # Ver B
        def eval(self, dataset):
            ...
        def predict(self, dataset):
            ...

        # Trainable
        def train(self, dataset):   # train until completion
            ...

        # Incremental
        def use_dataset(self, dataset):
            ...
        def adapt(self, n_steps=1):
            ...
        def has_converged(self):
            ...
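
To make the interface above more concrete, here is a minimal sketch of a toy learner that implements both the "train until completion" and the incremental variants. Everything specific in it is an assumption made for illustration: the class name ``MeanPredictor``, the hyper-parameters ``learning_rate`` and ``tolerance``, the representation of a dataset as a list of ``(x, y)`` pairs, and the choice of mean squared error for ``eval``. None of these are decided by this document.

```python
class MeanPredictor:
    """Toy learner (hypothetical): predicts a constant, fitted by
    gradient steps towards the mean of the training targets."""

    def __init__(self, learning_rate=0.5, tolerance=1e-6):
        self.learning_rate = learning_rate
        self.tolerance = tolerance
        self.estimate = 0.0
        self._dataset = None
        self._last_update = float("inf")

    def get_hyper_parameters(self):
        return {"learning_rate": self.learning_rate,
                "tolerance": self.tolerance}

    def set_hyper_parameters(self, dictionary):
        for name, value in dictionary.items():
            if name not in ("learning_rate", "tolerance"):
                raise KeyError(name)
            setattr(self, name, value)

    def predict(self, dataset):
        # Same constant prediction for every example.
        return [self.estimate for _ in dataset]

    def eval(self, dataset):
        # Assumption: eval returns the mean squared error.
        return sum((y - self.estimate) ** 2 for _, y in dataset) / len(dataset)

    # Trainable: implemented on top of the incremental interface.
    def train(self, dataset):
        self.use_dataset(dataset)
        while not self.has_converged():
            self.adapt()

    # Incremental
    def use_dataset(self, dataset):
        self._dataset = list(dataset)
        self._last_update = float("inf")

    def adapt(self, n_steps=1):
        mean_y = sum(y for _, y in self._dataset) / len(self._dataset)
        for _ in range(n_steps):
            step = self.learning_rate * (mean_y - self.estimate)
            self.estimate += step
            self._last_update = abs(step)

    def has_converged(self):
        return self._last_update < self.tolerance


data = [(0, 1.0), (1, 2.0), (2, 3.0)]

batch_learner = MeanPredictor()
batch_learner.train(data)                 # train until completion

incremental_learner = MeanPredictor()
incremental_learner.use_dataset(data)
while not incremental_learner.has_converged():
    incremental_learner.adapt(n_steps=5)  # client controls the pace

print(batch_learner.predict(data), batch_learner.eval(data))
```

In this sketch ``train`` is just a loop over ``adapt``, which suggests that a learner supporting the incremental interface could get the "Trainable" interface for free; whether that should be the default is left open here.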


Some example cases
------------------

.. code-block:: python

    class HyperLearner(Learner):

        ### def get_hyper_parameter_distribution(name)
        def set_hyper_parameters_distribution(self, dictionary):
            ...


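    def bagging(learner_factory, datasets):
        # todo: how to get each dataset_i is still open; here the caller
        # is assumed to supply one dataset per learner.
        learners = []
        for dataset_i in datasets:
            learner_i = learner_factory.new()
            learner_i.train(dataset_i)   # train until completion
            learners.append(learner_i)
        return learners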
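
One common way to fill in the open ``dataset_i`` question is bootstrap resampling, i.e. drawing as many examples as the original dataset contains, with replacement. The sketch below is self-contained and makes several assumptions that this document does not settle: ``MeanLearner`` and ``LearnerFactory`` are hypothetical stand-ins, a dataset is a list of ``(x, y)`` pairs, ``train`` takes the dataset as an argument, and the ensemble prediction is a plain average.

```python
import random


class MeanLearner:
    """Stand-in learner (hypothetical): predicts the mean training target."""

    def __init__(self):
        self.mean = 0.0

    def train(self, dataset):
        self.mean = sum(y for _, y in dataset) / len(dataset)

    def predict(self, dataset):
        return [self.mean for _ in dataset]


class LearnerFactory:
    """Hypothetical factory exposing a new() method."""

    def __init__(self, learner_class):
        self.learner_class = learner_class

    def new(self):
        return self.learner_class()


def bootstrap(dataset, rng):
    """Sample len(dataset) examples with replacement."""
    return [rng.choice(dataset) for _ in dataset]


def bagging(learner_factory, dataset, n_learners, seed=0):
    """Train n_learners learners, each on its own bootstrap sample."""
    rng = random.Random(seed)
    learners = []
    for _ in range(n_learners):
        learner_i = learner_factory.new()
        dataset_i = bootstrap(dataset, rng)
        learner_i.train(dataset_i)
        learners.append(learner_i)
    return learners


def bagged_predict(learners, dataset):
    """Average the predictions of all learners, example by example."""
    all_predictions = [learner.predict(dataset) for learner in learners]
    return [sum(column) / len(learners) for column in zip(*all_predictions)]


data = [(x, float(x)) for x in range(10)]
ensemble = bagging(LearnerFactory(MeanLearner), data, n_learners=25)
print(bagged_predict(ensemble, data[:2]))
```

Passing a seed makes the resampling reproducible, which matters if a Tester or HyperLearner needs to compare runs.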