Mercurial > pylearn
annotate sparse_random_autoassociator/main.py @ 370:a1bbcde6b456
Moved sparse_random_autoassociator from my repository
author | Joseph Turian <turian@gmail.com> |
---|---|
date | Mon, 07 Jul 2008 01:54:46 -0400 |
parents | |
children | 22463a194c90 |
rev | line source |
---|---|
370
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
1 #!/usr/bin/python |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
2 """ |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
3 An autoassociator for sparse inputs, using Ronan Collobert + Jason |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
4 Weston's sampling trick (2008). |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
5 |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
6 The learned model is:: |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
7 h = sigmoid(dot(x, w1) + b1) |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
8 y = sigmoid(dot(h, w2) + b2) |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
9 |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
10 We assume that most of the inputs are zero, and hence that we can |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
11 separate x into xnonzero, x's nonzero components, and a xzero, |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
12 a sample of the zeros. (We randomly without replacement choose |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
13 ZERO_SAMPLE_SIZE zero columns.) |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
14 |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
15 The desideratum is that every nonzero entry is separated from every |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
16 zero entry by margin at least MARGIN. |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
17 For each ynonzero, we want it to exceed max(yzero) by at least MARGIN. |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
18 For each yzero, we want it to be exceed by min(ynonzero) by at least MARGIN. |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
19 The loss is a hinge loss (linear). The loss is irrespective of the |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
20 xnonzero magnitude (this may be a limitation). Hence, all nonzeroes |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
21 are equally important to exceed the maximum yzero. |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
22 |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
23 LIMITATIONS: |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
24 - Only does pure stochastic gradient (batchsize = 1). |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
25 - Loss is irrespective of the xnonzero magnitude. |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
26 - We will always use all nonzero entries, even if the training |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
27 instance is very non-sparse. |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
28 |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
29 @bug: If there are not ZERO_SAMPLE_SIZE zeroes, we will enter an |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
30 endless loop. |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
31 """ |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
32 |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
33 |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
34 import numpy, random |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
35 import globals |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
36 random.seed(globals.SEED) |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
37 |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
38 nonzero_instances = [] |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
39 nonzero_instances.append({1: 0.1, 5: 0.5, 9: 1}) |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
40 nonzero_instances.append({2: 0.3, 5: 0.5, 8: 0.8}) |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
41 nonzero_instances.append({1: 0.2, 2: 0.3, 5: 0.5}) |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
42 |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
43 import model |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
44 model = model.Model() |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
45 |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
46 for i in xrange(100000): |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
47 # Select an instance |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
48 instance = nonzero_instances[i % len(nonzero_instances)] |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
49 |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
50 # Get the nonzero indices |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
51 nonzero_indexes = instance.keys() |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
52 nonzero_indexes.sort() |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
53 |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
54 # Get the zero indices |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
55 # @bug: If there are not ZERO_SAMPLE_SIZE zeroes, we will enter an endless loop. |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
56 zero_indexes = [] |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
57 while len(zero_indexes) < globals.ZERO_SAMPLE_SIZE: |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
58 idx = random.randint(0, globals.INPUT_DIMENSION - 1) |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
59 if idx in nonzero_indexes or idx in zero_indexes: continue |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
60 zero_indexes.append(idx) |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
61 zero_indexes.sort() |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
62 |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
63 # SGD update over instance |
a1bbcde6b456
Moved sparse_random_autoassociator from my repository
Joseph Turian <turian@gmail.com>
parents:
diff
changeset
|
64 model.update(instance, nonzero_indexes, zero_indexes) |