diff sandbox/sparse_random_autoassociator/main.py @ 393:36baeb7125a4
Made sandbox directory
author    Joseph Turian <turian@gmail.com>
date      Tue, 08 Jul 2008 18:46:26 -0400
parents   sparse_random_autoassociator/main.py@e4473d9697d7
children  (none)
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/sandbox/sparse_random_autoassociator/main.py	Tue Jul 08 18:46:26 2008 -0400
@@ -0,0 +1,48 @@
+#!/usr/bin/python
+"""
+    An autoassociator for sparse inputs, using Ronan Collobert + Jason
+    Weston's sampling trick (2008).
+
+    The learned model is::
+        h = sigmoid(dot(x, w1) + b1)
+        y = sigmoid(dot(h, w2) + b2)
+
+    We assume that most of the inputs are zero, and hence that
+    we can separate x into xnonzero, x's nonzero components, and
+    xzero, a sample of the zeros. We sample---randomly without
+    replacement---ZERO_SAMPLE_SIZE zero columns from x.
+
+    The desideratum is that every nonzero entry is separated from every
+    zero entry by a margin of at least MARGIN.
+    For each ynonzero, we want it to exceed max(yzero) by at least MARGIN.
+    For each yzero, we want it to be exceeded by min(ynonzero) by at least MARGIN.
+    The loss is a hinge loss (linear). The loss ignores the magnitude of
+    xnonzero (this may be a limitation). Hence, all nonzeroes are equally
+    important to exceed the maximum yzero.
+
+    (Alternatively, there is a commented-out binary cross-entropy loss.)
+
+    LIMITATIONS:
+    - Only does pure stochastic gradient (batchsize = 1).
+    - The loss ignores the magnitude of xnonzero.
+    - We will always use all nonzero entries, even if the training
+      instance is very non-sparse.
+"""
+
+
+import numpy
+
+nonzero_instances = []
+nonzero_instances.append({1: 0.1, 5: 0.5, 9: 1})
+nonzero_instances.append({2: 0.3, 5: 0.5, 8: 0.8})
+nonzero_instances.append({1: 0.2, 2: 0.3, 5: 0.5})
+
+import model
+model = model.Model()
+
+for i in xrange(100000):
+    # Select a training instance, cycling through the toy data above.
+    instance = nonzero_instances[i % len(nonzero_instances)]
+
+    # SGD update over the instance.
+    model.update(instance)
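
Note: the model module imported above is not part of this diff. As a rough,
standalone NumPy sketch of the loss the docstring describes (the linear hinge
over sampled zero columns), the following may help. Every name here
(INPUT_DIM, HIDDEN_DIM, MARGIN, ZERO_SAMPLE_SIZE, the weight initialization)
is an illustrative assumption, not the actual content of the missing model
module::

    import numpy

    INPUT_DIM = 1000        # assumed input dimensionality (not given in the diff)
    HIDDEN_DIM = 100        # assumed hidden-layer size
    MARGIN = 0.25           # assumed margin; the real value is in the model module
    ZERO_SAMPLE_SIZE = 50   # assumed number of zero columns sampled per update

    rng = numpy.random.RandomState(0)
    w1 = rng.uniform(-0.1, 0.1, size=(INPUT_DIM, HIDDEN_DIM))
    b1 = numpy.zeros(HIDDEN_DIM)
    w2 = rng.uniform(-0.1, 0.1, size=(HIDDEN_DIM, INPUT_DIM))
    b2 = numpy.zeros(INPUT_DIM)

    def sigmoid(a):
        return 1.0 / (1.0 + numpy.exp(-a))

    def margin_loss(instance):
        """Hinge loss of one sparse instance, given as {column: value}."""
        nonzero_idx = numpy.array(sorted(instance))
        # Sample ZERO_SAMPLE_SIZE zero columns, randomly without replacement.
        zero_pool = numpy.setdiff1d(numpy.arange(INPUT_DIM), nonzero_idx)
        zero_idx = rng.permutation(zero_pool)[:ZERO_SAMPLE_SIZE]

        # Forward pass, as in the docstring.
        x = numpy.zeros(INPUT_DIM)
        x[nonzero_idx] = [instance[i] for i in sorted(instance)]
        h = sigmoid(numpy.dot(x, w1) + b1)
        y = sigmoid(numpy.dot(h, w2) + b2)

        ynonzero, yzero = y[nonzero_idx], y[zero_idx]
        # Each ynonzero should exceed max(yzero) by MARGIN, and each yzero
        # should be exceeded by min(ynonzero) by MARGIN; violations are linear.
        loss_nonzero = numpy.maximum(0.0, MARGIN - (ynonzero - yzero.max())).sum()
        loss_zero = numpy.maximum(0.0, MARGIN - (ynonzero.min() - yzero)).sum()
        return loss_nonzero + loss_zero

For example, margin_loss({1: 0.1, 5: 0.5, 9: 1}) returns a scalar; a real
model.update(instance) would additionally take the gradient of this loss with
respect to w1, b1, w2, b2 and apply a stochastic gradient step.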
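
The docstring also mentions an alternative, commented-out binary
cross-entropy loss. That code is not in this diff; restricted to the same
sampled columns, a hedged sketch (reusing numpy and the sampled indices from
the snippet above) could be::

    def xent_loss(y, nonzero_idx, zero_idx, instance):
        """Binary cross-entropy over the sampled output columns only.

        Targets are the xnonzero values for the nonzero columns and 0 for
        the sampled zero columns."""
        eps = 1e-12  # guard against log(0)
        target_nz = numpy.array([instance[i] for i in sorted(instance)])
        y_nz, y_z = y[nonzero_idx], y[zero_idx]
        xent_nz = -(target_nz * numpy.log(y_nz + eps)
                    + (1.0 - target_nz) * numpy.log(1.0 - y_nz + eps)).sum()
        xent_z = -numpy.log(1.0 - y_z + eps).sum()
        return xent_nz + xent_z

Unlike the hinge loss, this variant would depend on the xnonzero magnitudes,
which touches on one of the LIMITATIONS noted in the docstring.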