comparison sandbox/sparse_random_autoassociator/main.py @ 393:36baeb7125a4

Made sandbox directory
author Joseph Turian <turian@gmail.com>
date Tue, 08 Jul 2008 18:46:26 -0400
parents sparse_random_autoassociator/main.py@e4473d9697d7
children
#!/usr/bin/python
"""
An autoassociator for sparse inputs, using Ronan Collobert + Jason
Weston's sampling trick (2008).

The learned model is::
    h = sigmoid(dot(x, w1) + b1)
    y = sigmoid(dot(h, w2) + b2)

We assume that most of the inputs are zero, so we can separate x into
xnonzero (x's nonzero components) and xzero (a sample of x's zero
columns). We sample ZERO_SAMPLE_SIZE zero columns from x at random,
without replacement.

The desideratum is that every nonzero entry is separated from every
zero entry by a margin of at least MARGIN.
For each ynonzero, we want it to exceed max(yzero) by at least MARGIN.
For each yzero, we want it to be exceeded by min(ynonzero) by at least MARGIN.
The loss is a (linear) hinge loss; a standalone sketch of it follows
this docstring. The loss ignores the xnonzero magnitudes (this may be
a limitation), so all nonzero entries are equally important to push
above the maximum yzero.

(Alternatively, there is a commented-out binary cross-entropy (xent) loss.)

LIMITATIONS:
- Only does pure stochastic gradient (batchsize = 1).
- The loss ignores the xnonzero magnitudes.
- We always use all nonzero entries, even if the training instance is
  quite dense.
"""


import numpy

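# Each instance maps column index -> nonzero value; all other columns are
# implicitly zero. For example (assuming an input width of 10), the first
# instance below, {1: 0.1, 5: 0.5, 9: 1}, denotes the dense vector
# [0, 0.1, 0, 0, 0, 0.5, 0, 0, 0, 1].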
nonzero_instances = []
nonzero_instances.append({1: 0.1, 5: 0.5, 9: 1})
nonzero_instances.append({2: 0.3, 5: 0.5, 8: 0.8})
nonzero_instances.append({1: 0.2, 2: 0.3, 5: 0.5})

from model import Model
model = Model()

for i in xrange(100000):
    # Select an instance, cycling deterministically through the training set
    instance = nonzero_instances[i % len(nonzero_instances)]

    # One SGD update on this instance (batchsize = 1)
    model.update(instance)
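
# A possible variant (an assumption, not something this script does): select
# training instances uniformly at random rather than cycling in order, e.g.
#
#     import random
#     instance = random.choice(nonzero_instances)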