view sandbox/sparse_random_autoassociator/main.py @ 416:8849eba55520
Can now do minibatch update
author | Joseph Turian <turian@iro.umontreal.ca> |
---|---|
date | Fri, 11 Jul 2008 16:34:46 -0400 |
parents | 36baeb7125a4 |
children |
#!/usr/bin/python
"""
    An autoassociator for sparse inputs, using Ronan Collobert + Jason
    Weston's sampling trick (2008).

    The learned model is::
        h = sigmoid(dot(x, w1) + b1)
        y = sigmoid(dot(h, w2) + b2)

    We assume that most of the inputs are zero, and hence that
    we can separate x into xnonzero, x's nonzero components, and
    xzero, a sample of the zeros. We sample---randomly without
    replacement---ZERO_SAMPLE_SIZE zero columns from x.

    The desideratum is that every nonzero entry is separated from every
    zero entry by margin at least MARGIN.
    For each ynonzero, we want it to exceed max(yzero) by at least MARGIN.
    For each yzero, we want it to be exceeded by min(ynonzero) by at least MARGIN.
    The loss is a hinge loss (linear). The loss is irrespective of the
    xnonzero magnitude (this may be a limitation). Hence, all nonzeroes
    are equally important to exceed the maximum yzero.

    (Alternately, there is a commented out binary xent loss.)

    LIMITATIONS:
        - Only does pure stochastic gradient (batchsize = 1).
        - Loss is irrespective of the xnonzero magnitude.
        - We will always use all nonzero entries, even if the training
          instance is very non-sparse.
"""

import numpy

# Toy training set: each instance maps a nonzero input index to its value.
nonzero_instances = []
nonzero_instances.append({1: 0.1, 5: 0.5, 9: 1})
nonzero_instances.append({2: 0.3, 5: 0.5, 8: 0.8})
nonzero_instances.append({1: 0.2, 2: 0.3, 5: 0.5})

import model
model = model.Model()

for i in xrange(100000):
    # Select an instance
    instance = nonzero_instances[i % len(nonzero_instances)]

    # SGD update over instance
    model.update(instance)
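
The `model` module is not shown in this view, so the following is only a minimal sketch of what the docstring describes: sampling zero columns without replacement, a forward pass restricted to the selected columns, and the margin hinge loss. Every name here (`sample_zero_indices`, `forward_selected`, `margin_hinge_loss`, the dimensions, and the values chosen for `MARGIN` and `ZERO_SAMPLE_SIZE`) is an assumption made for illustration, not the code in `model.py`.

```python
import numpy

# Hypothetical settings; the real values live in code not shown here.
INPUT_DIMENSION = 20
HIDDEN_DIMENSION = 4
ZERO_SAMPLE_SIZE = 5
MARGIN = 0.25


def sigmoid(a):
    return 1.0 / (1.0 + numpy.exp(-a))


def sample_zero_indices(nonzero_indices, input_dimension, zero_sample_size, rng):
    """Sample zero columns of x, randomly and without replacement."""
    candidates = numpy.setdiff1d(numpy.arange(input_dimension),
                                 numpy.array(sorted(nonzero_indices)))
    return rng.choice(candidates, size=zero_sample_size, replace=False)


def forward_selected(instance, selected, w1, b1, w2, b2):
    """Compute h from the nonzero inputs only, and y for the selected columns only."""
    keys = sorted(instance)
    values = numpy.array([instance[k] for k in keys], dtype=float)
    # dot(x, w1) only needs the rows of w1 that match nonzero entries of x.
    h = sigmoid(numpy.dot(values, w1[numpy.array(keys)]) + b1)
    y = sigmoid(numpy.dot(h, w2[:, selected]) + b2[selected])
    return h, y


def margin_hinge_loss(ynonzero, yzero, margin):
    """Linear hinge loss separating every ynonzero from every yzero by margin."""
    # Each ynonzero should exceed max(yzero) by at least margin.
    nonzero_loss = numpy.maximum(0.0, margin - (ynonzero - yzero.max()))
    # Each yzero should be exceeded by min(ynonzero) by at least margin.
    zero_loss = numpy.maximum(0.0, margin - (ynonzero.min() - yzero))
    return nonzero_loss.sum() + zero_loss.sum()


# Usage on one instance with randomly initialized (untrained) parameters.
rng = numpy.random.RandomState(0)
w1 = rng.uniform(-0.1, 0.1, (INPUT_DIMENSION, HIDDEN_DIMENSION))
b1 = numpy.zeros(HIDDEN_DIMENSION)
w2 = rng.uniform(-0.1, 0.1, (HIDDEN_DIMENSION, INPUT_DIMENSION))
b2 = numpy.zeros(INPUT_DIMENSION)

instance = {1: 0.1, 5: 0.5, 9: 1}
nonzero_idx = numpy.array(sorted(instance))
zero_idx = sample_zero_indices(instance, INPUT_DIMENSION, ZERO_SAMPLE_SIZE, rng)

_, ynonzero = forward_selected(instance, nonzero_idx, w1, b1, w2, b2)
_, yzero = forward_selected(instance, zero_idx, w1, b1, w2, b2)
loss = margin_hinge_loss(ynonzero, yzero, MARGIN)
```

Note that the loss depends only on which entries are nonzero, not on their magnitudes, which matches the limitation listed in the docstring. The sketch stops at the loss; the gradient step performed by `model.update` is not reproduced here.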