annotate pylearn/algorithms/mcRBM.py @ 1498:0f326860210e
Merged
author | Olivier Delalleau <delallea@iro> |
---|---|
date | Thu, 01 Sep 2011 13:35:15 -0400 |
parents | 54b2268db0d7 |
children | f82b80c841b2 |
rev | line source |
---|---|
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
"""
This file implements the Mean & Covariance RBM discussed in

    Ranzato, M. and Hinton, G. E. (2010)
    Modeling pixel means and covariances using factored third-order Boltzmann machines.
    IEEE Conference on Computer Vision and Pattern Recognition.

and performs one of the experiments on CIFAR-10 discussed in that paper.  There are some minor
discrepancies between the paper and the accompanying code (train_mcRBM.py), and the
accompanying code has been taken to be correct in those cases because I couldn't get things to
work otherwise.


Math
====

Energy of "covariance RBM"

    E = -0.5 \sum_f \sum_k P_{fk} h_k ( \sum_i C_{if} v_i )^2
      = -0.5 \sum_f (\sum_k P_{fk} h_k) ( \sum_i C_{if} v_i )^2
                    "vector element f"  "vector element f"

In some parts of the paper, the P matrix is chosen to be a diagonal matrix with non-positive
diagonal entries.  Taking P to be the negative identity matrix, it is helpful to see this as a
simpler equation:

    E = 0.5 \sum_f h_f ( \sum_i C_{if} v_i )^2



Version in paper
----------------

Full Energy of the Mean and Covariance RBM, with
:math:`h_k = h_k^{(c)}`,
:math:`g_j = h_j^{(m)}`,
:math:`b_k = b_k^{(c)}`,
:math:`c_j = b_j^{(m)}`,
:math:`U_{if} = C_{if}`,

    E (v, h, g) =
        - 0.5 \sum_f \sum_k P_{fk} h_k ( \sum_i U_{if} v_i / (|U_{.f}| * |v|) )^2
        - \sum_k b_k h_k
        + 0.5 \sum_i v_i^2
        - \sum_j \sum_i W_{ij} g_j v_i
        - \sum_j c_j g_j

For the energy function to correspond to a probability distribution, P must be non-positive.  P
is initialized to be a diagonal or a topological pooling matrix, and in our experience it can
be left as such: even in the paper it has a very low learning rate, and in effect it is only
updated after the filters in U have been learned.

Version in published train_mcRBM code
-------------------------------------

The train_mcRBM file implements learning in a similar but technically different Energy function:

    E (v, h, g) =
        - 0.5 \sum_f \sum_k P_{fk} h_k (\sum_i U_{if} v_i / sqrt(\sum_i v_i^2/I + 0.5))^2
        - \sum_k b_k h_k
        + 0.5 \sum_i v_i^2
        - \sum_j \sum_i W_{ij} g_j v_i
        - \sum_j c_j g_j

There are two differences with respect to the paper:

    - 'v' is not normalized by its length, but rather it is normalized to have length close to
      the square root of the number of its components.  The variable called 'small' that
      "avoids division by zero" is orders of magnitude larger than machine precision, and is
      on the order of the normalized sum-of-squares, so I've included it in the Energy function.

    - 'U' is also not normalized by its length.  U is initialized to have columns that are
      shorter than unit-length (approximately 0.2 with the 105 principal components in the
      train_mcRBM data).  During training, the columns of U are constrained manually to have
      equal lengths (see the use of normVF), but their common Euclidean norm is allowed to
      change.  During learning it quickly converges towards 1 and then exceeds 1.  This
      column-wise normalization of U does not seem to be justified by maximum likelihood; I
      have no intuition for why it is used.


Version in this code
--------------------

This file implements the same algorithm as the train_mcRBM code, except that the P matrix is
omitted for clarity, and replaced analytically with a negative identity matrix.

    E (v, h, g) =
        + 0.5 \sum_k h_k (\sum_i U_{ik} v_i / sqrt(\sum_i v_i^2/I + 0.5))^2
        - \sum_k b_k h_k
        + 0.5 \sum_i v_i^2
        - \sum_j \sum_i W_{ij} g_j v_i
        - \sum_j c_j g_j

With a pooling matrix P, the energy has the same form as in the train_mcRBM code:

    E (v, h, g) =
        - 0.5 \sum_f \sum_k P_{fk} h_k (\sum_i U_{if} v_i / sqrt(\sum_i v_i^2/I + 0.5))^2
        - \sum_k b_k h_k
        + 0.5 \sum_i v_i^2
        - \sum_j \sum_i W_{ij} g_j v_i
        - \sum_j c_j g_j



Conventions in this file
========================

This file contains some global functions, as well as a class (MeanCovRBM) that makes using them a little
more convenient.


Global functions like `free_energy` work on an mcRBM as parametrized in a particular way.
Suppose we have
 - I input dimensions,
 - F squared filters,
 - J mean variables, and
 - K covariance variables.

The mcRBM is parametrized by the following variables:

 - `P`, a matrix whose columns indicate covariance filter groups (F x K)
 - `U`, a matrix whose columns are visible covariance directions (I x F)
 - `W`, a matrix whose columns are visible mean directions (I x J)
 - `b`, a vector of hidden covariance biases (K)
 - `c`, a vector of hidden mean biases (J)

Matrices are generally laid out and accessed according to a C-order convention.

"""
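As a reading aid for the docstring above, here is a minimal NumPy sketch of the energy given under "Version in this code", for a single example. The names `normalized_visible` and `mcrbm_energy` are hypothetical and are not taken from this file; the argument shapes follow the conventions listed in the docstring, and the constant 0.5 stands in for the variable called 'small'.

```python
import numpy as np


def normalized_visible(v):
    """Rescale v as in the train_mcRBM-style energy (hypothetical helper).

    v has shape (I,).  The result has length close to sqrt(I) when the
    mean of v**2 is large compared with the constant 0.5.
    """
    I = v.shape[0]
    return v / np.sqrt(np.dot(v, v) / I + 0.5)


def mcrbm_energy(v, h, g, P, U, W, b, c):
    """Energy E(v, h, g) for one example (illustrative sketch only).

    Assumed shapes: v (I,), h (K,), g (J,), P (F, K), U (I, F),
    W (I, J), b (K,), c (J,).
    """
    # Squared filter responses to the normalized input, one per filter f.
    squared_filters = np.dot(normalized_visible(v), U) ** 2      # (F,)
    # -0.5 * sum_f sum_k P_fk h_k (filter response f)^2
    covariance_term = -0.5 * np.dot(squared_filters, np.dot(P, h))
    return (covariance_term
            - np.dot(b, h)                 # covariance hidden biases
            + 0.5 * np.dot(v, v)           # quadratic term on the visibles
            - np.dot(v, np.dot(W, g))      # mean units
            - np.dot(c, g))                # mean hidden biases


if __name__ == "__main__":
    rng = np.random.RandomState(0)
    I, F, J = 6, 4, 3
    K = F                                  # P = -identity requires F == K
    v = rng.randn(I)
    h = rng.randint(0, 2, size=K).astype(float)
    g = rng.randint(0, 2, size=J).astype(float)
    energy = mcrbm_energy(v, h, g, -np.eye(F), rng.randn(I, F),
                          rng.randn(I, J), rng.randn(K), rng.randn(J))
    print(energy)
```

Passing `P = -np.eye(F)` reduces the first term to `+ 0.5 \sum_k h_k (...)^2`, which is the form with P omitted.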

#
# WORKING NOTES
# THIS DERIVATION IS BASED ON THE ** PAPER ** ENERGY FUNCTION
# NOT THE ENERGY FUNCTION IN THE CODE!!!
#
# Free energy is the marginal energy of visible units
# Recall:
#   Q(x) = exp(-E(x))/Z ==> -log(Q(x)) - log(Z) = E(x)
#
#
#   E (v, h, g) =
#       - 0.5 \sum_f \sum_k P_{fk} h_k ( \sum_i U_{if} v_i )^2 / (|U_{*f}|^2 |v|^2)
#       - \sum_k b_k h_k
#       + 0.5 \sum_i v_i^2
#       - \sum_j \sum_i W_{ij} g_j v_i
#       - \sum_j c_j g_j
#       - \sum_i a_i v_i
#
#
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
147 # Derivation, in which partition functions are ignored. |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
148 # |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
149 # E(v) = -\log(Q(v)) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
150 # = -\log( \sum_{h,g} Q(v,h,g)) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
151 # = -\log( \sum_{h,g} exp(-E(v,h,g))) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
152 # = -\log( \sum_{h,g} exp(- |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
153 # - 0.5 \sum_f \sum_k P_{fk} h_k ( \sum_i U_{if} v_i )^2 / (|U_{*f}| * |v|) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
154 # - \sum_k b_k h_k |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
155 # + 0.5 \sum_i v_i^2 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
156 # - \sum_j \sum_i W_{ij} g_j v_i |
972
0b392d1401c5
mcRBM - adding math and comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
967
diff
changeset
|
157 # - \sum_j c_j g_j |
0b392d1401c5
mcRBM - adding math and comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
967
diff
changeset
|
158 # - \sum_i a_i v_i )) |
0b392d1401c5
mcRBM - adding math and comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
967
diff
changeset
|
159 # |
0b392d1401c5
mcRBM - adding math and comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
967
diff
changeset
|
160 # Get rid of double negs in exp |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
161 # = -\log( \sum_{h} exp( |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
162 # + 0.5 \sum_f \sum_k P_{fk} h_k ( \sum_i U_{if} v_i )^2 / (|U_{*f}| * |v|) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
163 # + \sum_k b_k h_k |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
164 # - 0.5 \sum_i v_i^2 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
165 # ) * \sum_{g} exp( |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
166 # + \sum_j \sum_i W_{ij} g_j v_i |
972
0b392d1401c5
mcRBM - adding math and comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
967
diff
changeset
|
167 # + \sum_j c_j g_j)) |
0b392d1401c5
mcRBM - adding math and comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
967
diff
changeset
|
168 # - \sum_i a_i v_i |
0b392d1401c5
mcRBM - adding math and comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
967
diff
changeset
|
169 # |
0b392d1401c5
mcRBM - adding math and comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
967
diff
changeset
|
170 # Break up log |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
171 # = -\log( \sum_{h} exp( |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
172 # + 0.5 \sum_f \sum_k P_{fk} h_k ( \sum_i U_{if} v_i )^2 / (|U_{*f}|*|v|) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
173 # + \sum_k b_k h_k |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
174 # )) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
175 # -\log( \sum_{g} exp( |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
176 # + \sum_j \sum_i W_{ij} g_j v_i |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
177 # + \sum_j c_j g_j ))) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
178 # + 0.5 \sum_i v_i^2 |
972
0b392d1401c5
mcRBM - adding math and comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
967
diff
changeset
|
179 # - \sum_i a_i v_i |
0b392d1401c5
mcRBM - adding math and comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
967
diff
changeset
|
180 # |
0b392d1401c5
mcRBM - adding math and comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
967
diff
changeset
|
181 # Use domain h is binary to turn log(sum(exp(sum...))) into sum(log(.. |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
182 # = -\log(\sum_{h} exp( |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
183 # + 0.5 \sum_f \sum_k P_{fk} h_k ( \sum_i U_{if} v_i )^2 / (|U_{*f}|* |v|) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
184 # + \sum_k b_k h_k |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
185 # )) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
186 # - \sum_{j} \log(1 + exp(\sum_i W_{ij} v_i + c_j )) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
187 # + 0.5 \sum_i v_i^2 |
972
0b392d1401c5
mcRBM - adding math and comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
967
diff
changeset
|
188 # - \sum_i a_i v_i |
0b392d1401c5
mcRBM - adding math and comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
967
diff
changeset
|
189 # |
0b392d1401c5
mcRBM - adding math and comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
967
diff
changeset
|
190 # = - \sum_{k} \log(1 + exp(b_k + 0.5 \sum_f P_{fk}( \sum_i U_{if} v_i )^2 / (|U_{*f}|*|v|))) |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
191 # - \sum_{j} \log(1 + exp(\sum_i W_{ij} v_i + c_j )) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
192 # + 0.5 \sum_i v_i^2 |
972
0b392d1401c5
mcRBM - adding math and comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
967
diff
changeset
|
193 # - \sum_i a_i v_i |
0b392d1401c5
mcRBM - adding math and comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
967
diff
changeset
|
194 # |
0b392d1401c5
mcRBM - adding math and comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
967
diff
changeset
|
195 # For negative-one-diagonal P this gives: |
0b392d1401c5
mcRBM - adding math and comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
967
diff
changeset
|
196 # |
0b392d1401c5
mcRBM - adding math and comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
967
diff
changeset
|
197 # = - \sum_{k} \log(1 + exp(b_k - 0.5 ( \sum_i U_{ik} v_i )^2 / (|U_{*k}|*|v|))) |
0b392d1401c5
mcRBM - adding math and comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
967
diff
changeset
|
198 # - \sum_{j} \log(1 + exp(\sum_i W_{ij} v_i + c_j )) |
0b392d1401c5
mcRBM - adding math and comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
967
diff
changeset
|
199 # + 0.5 \sum_i v_i^2 |
0b392d1401c5
mcRBM - adding math and comments
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
967
diff
changeset
|
200 # - \sum_i a_i v_i |
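The step from the log-sum-exp over binary h to a sum of log(1 + exp(.)) terms can be checked numerically. The following is a stdlib-only sketch, not part of the module: `x` stands in for the per-unit coefficient of h_k (the bias plus the covariance term), and all function names are hypothetical.

```python
import itertools
import math

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def neg_log_sum_over_binary_h(x):
    """Brute force: -log sum_{h in {0,1}^K} exp(sum_k x_k h_k)."""
    total = 0.0
    for h in itertools.product((0, 1), repeat=len(x)):
        total += math.exp(sum(xk * hk for xk, hk in zip(x, h)))
    return -math.log(total)

def neg_sum_softplus(x):
    """Closed form: -sum_k log(1 + exp(x_k))."""
    return -sum(softplus(xk) for xk in x)

# Arbitrary illustrative coefficients, one per hidden unit.
x = [0.3, -1.2, 2.0, 0.0]
brute = neg_log_sum_over_binary_h(x)
closed = neg_sum_softplus(x)
```

The two values agree because the sum over all binary configurations factorizes into a product of (1 + exp(x_k)) terms, one per hidden unit.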
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
201 |
1000
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
202 import sys, os, logging |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
203 import numpy as np |
973
aa201f357d7b
mcRBM - added numpy import
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
972
diff
changeset
|
204 import numpy |
988
fd243cb2bf0b
mcRBM - moved some things to the top of the file
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
987
diff
changeset
|
205 |
fd243cb2bf0b
mcRBM - moved some things to the top of the file
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
987
diff
changeset
|
206 import theano |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
207 from theano import function, shared, dot |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
208 from theano import tensor as TT |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
209 floatX = theano.config.floatX |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
210 |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
211 sharedX = lambda X, name : shared(numpy.asarray(X, dtype=floatX), name=name) |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
212 |
988
fd243cb2bf0b
mcRBM - moved some things to the top of the file
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
987
diff
changeset
|
213 import pylearn |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
214 from pylearn.sampling.hmc import HMC_sampler |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
215 from pylearn.io import image_tiling |
999
c6d08a760960
added sgd_updates to gd/sgd.py. Modif mcRBM to use it.
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
998
diff
changeset
|
216 from pylearn.gd.sgd import sgd_updates |
1000
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
217 import pylearn.dataset_ops.image_patches |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
218 |
1000
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
219 ########################################### |
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
220 # |
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
221 # Candidates for factoring |
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
222 # |
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
223 ########################################### |
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
224 |
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
225 def l1(X): |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
226 """ |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
227 :param X: TensorType variable |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
228 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
229 :rtype: TensorType scalar |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
230 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
231 :returns: the sum of absolute values of the terms in X |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
232 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
233 :math: \sum_i |X_i| |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
234 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
235 Where i is an appropriately dimensioned index. |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
236 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
237 """ |
1000
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
238 return abs(X).sum() |
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
239 |
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
240 def l2(X): |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
241 """ |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
242 :param X: TensorType variable |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
243 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
244 :rtype: TensorType scalar |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
245 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
246 :returns: the square root of the sum of squares of the terms in X |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
247 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
248 :math: \sqrt{ \sum_i X_i^2 } |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
249 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
250 Where i is an appropriately dimensioned index. |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
251 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
252 """ |
1000
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
253 return TT.sqrt((X**2).sum()) |
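For reference, the same two norms can be written on plain Python lists. This is only an illustrative sketch; `l1_plain` and `l2_plain` are hypothetical names and operate on flat lists rather than Theano variables.

```python
import math

def l1_plain(xs):
    """Sum of absolute values of the entries."""
    return sum(abs(x) for x in xs)

def l2_plain(xs):
    """Square root of the sum of squared entries (Euclidean norm)."""
    return math.sqrt(sum(x * x for x in xs))

xs = [3.0, -4.0]
l1_value = l1_plain(xs)
l2_value = l2_plain(xs)
```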
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
254 |
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
255 def contrastive_cost(free_energy_fn, pos_v, neg_v): |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
256 """ |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
257 :param free_energy_fn: lambda (TensorType matrix MxN) -> TensorType vector of M free energies |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
258 :param pos_v: TensorType matrix MxN of M "positive phase" particles |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
259 :param neg_v: TensorType matrix MxN of M "negative phase" particles |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
260 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
261 :returns: TensorType scalar that's the sum of the difference of free energies |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
262 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
263 :math: \sum_i ( free_energy(pos_v[i]) - free_energy(neg_v[i]) ) |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
264 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
265 """ |
1000
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
266 return (free_energy_fn(pos_v) - free_energy_fn(neg_v)).sum() |
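A small worked example may make the contrastive cost concrete. The sketch below assumes a made-up free energy (half the squared norm of each row) purely for illustration; it is not the mcRBM free energy, and the names are hypothetical.

```python
def toy_free_energy(rows):
    """Hypothetical free energy: 0.5 * squared norm of each row."""
    return [0.5 * sum(x * x for x in row) for row in rows]

def contrastive_cost_plain(free_energy_fn, pos_v, neg_v):
    """Sum over particles of F(pos_v[i]) - F(neg_v[i])."""
    pos = free_energy_fn(pos_v)
    neg = free_energy_fn(neg_v)
    return sum(p - n for p, n in zip(pos, neg))

pos_v = [[1.0, 0.0], [0.0, 2.0]]   # free energies 0.5 and 2.0
neg_v = [[1.0, 1.0], [0.0, 0.0]]   # free energies 1.0 and 0.0
cost = contrastive_cost_plain(toy_free_energy, pos_v, neg_v)
```

Here the per-particle differences are -0.5 and 2.0, so the cost is 1.5.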
988
fd243cb2bf0b
mcRBM - moved some things to the top of the file
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
987
diff
changeset
|
267 |
1395
54b2268db0d7
mcRBM.contrastive_grad accepts optional "consider_constant" arg
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1351
diff
changeset
|
268 def contrastive_grad(free_energy_fn, pos_v, neg_v, wrt, other_cost=0, consider_constant=[]): |
1000
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
269 """ |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
270 :param free_energy_fn: lambda (TensorType matrix MxN) -> TensorType vector of M free energies |
1000
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
271 :param pos_v: positive-phase sample of visible units |
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
272 :param neg_v: negative-phase sample of visible units |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
273 :param wrt: TensorType variables with respect to which we want gradients (similar to the |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
274 'wrt' argument to tensor.grad) |
1351
6402b3309ece
mcRBM - added to docstring
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1350
diff
changeset
|
275 :param other_cost: TensorType scalar (should be the sum over a minibatch, not mean) |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
276 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
277 :returns: TensorType variables for the gradient on each of the 'wrt' arguments |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
278 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
279 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
280 :math: Cost = other_cost + \sum_i ( free_energy(pos_v[i]) - free_energy(neg_v[i]) ) |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
281 :math: d Cost / dW for W in `wrt` |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
282 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
283 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
284 This function is similar to tensor.grad - it returns the gradient[s] on a cost with respect |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
285 to one or more parameters. The difference between tensor.grad and this function is that |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
286 the negative phase term (`neg_v`) is considered constant, i.e. d `Cost` / d `neg_v` = 0. |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
287 This is desirable because `neg_v` might be the result of a sampling expression involving |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
288 some of the parameters, but the contrastive divergence algorithm does not call for |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
289 backpropagating through the sampling procedure. |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
290 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
291 Warning - if other_cost depends on pos_v or neg_v and you *do* want to backpropagate from |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
292 the `other_cost` through those terms, then this function is inappropriate. In that case, |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
293 you should call tensor.grad separately for the other_cost and add the gradient expressions |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
294 you get from ``contrastive_grad(..., other_cost=0)`` |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
295 |
1000
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
296 """ |
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
297 cost=contrastive_cost(free_energy_fn, pos_v, neg_v) |
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
298 if other_cost: |
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
299 cost = cost + other_cost |
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
300 return theano.tensor.grad(cost, |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
301 wrt=wrt, |
1395
54b2268db0d7
mcRBM.contrastive_grad accepts optional "consider_constant" arg
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1351
diff
changeset
|
302 consider_constant=consider_constant+[neg_v]) |
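The effect of treating `neg_v` as constant can be illustrated without Theano. The sketch below uses a hypothetical scalar free energy F(v; w) = 0.5 (v - w)^2 and a hypothetical sampler whose output depends on the parameter `w`; gradients are estimated by central differences. It only demonstrates the idea and is not how the module computes gradients.

```python
def free_energy(v, w):
    """Hypothetical scalar free energy with a single parameter w."""
    return 0.5 * (v - w) ** 2

def sample_neg(w, noise):
    """Hypothetical sampler: negative particles depend on w."""
    return [w + e for e in noise]

def cost_frozen(w, pos_v, neg_v):
    """Contrastive cost with neg_v held fixed (not a function of w)."""
    return sum(free_energy(p, w) - free_energy(n, w)
               for p, n in zip(pos_v, neg_v))

def cost_through_sampler(w, pos_v, noise):
    """Same cost, but neg_v is recomputed from w on every evaluation."""
    return cost_frozen(w, pos_v, sample_neg(w, noise))

def central_diff(f, w, h=1e-6):
    return (f(w + h) - f(w - h)) / (2.0 * h)

pos_v = [1.0, 2.0]
noise = [0.5, 0.25]
w0 = 0.2
neg_v = sample_neg(w0, noise)  # drawn once, then treated as constant

grad_frozen = central_diff(lambda w: cost_frozen(w, pos_v, neg_v), w0)
grad_through = central_diff(lambda w: cost_through_sampler(w, pos_v, noise), w0)
# With neg_v frozen, dCost/dw = sum_i (neg_v[i] - pos_v[i]) for this toy model.
analytic_frozen = sum(n - p for p, n in zip(pos_v, neg_v))
```

In this toy setting the two gradients differ (about -1.85 with `neg_v` frozen versus about -2.6 when differentiating through the sampler), which is the distinction the docstring describes.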
973
aa201f357d7b
mcRBM - added numpy import
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
972
diff
changeset
|
303 |
1000
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
304 ########################################### |
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
305 # |
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
306 # Expressions that are mcRBM-specific |
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
307 # |
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
308 ########################################### |
d4a14c6c36e0
mcRBM - post code-review #1 with Guillaume
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
999
diff
changeset
|
309 |
1267
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
310 class mcRBM(object): |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
311 """Light-weight class that provides the math related to inference |
995
68ca3ea34e72
mcRBM - cleaned up new_from_dims
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
994
diff
changeset
|
312 |
68ca3ea34e72
mcRBM - cleaned up new_from_dims
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
994
diff
changeset
|
313 Attributes: |
68ca3ea34e72
mcRBM - cleaned up new_from_dims
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
994
diff
changeset
|
314 |
997
71b0132b694a
mcRBM - removed container logic that was redundant with global methods
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
996
diff
changeset
|
315 - U - the covariance filters (theano shared variable) |
71b0132b694a
mcRBM - removed container logic that was redundant with global methods
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
996
diff
changeset
|
316 - W - the mean filters (theano shared variable) |
71b0132b694a
mcRBM - removed container logic that was redundant with global methods
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
996
diff
changeset
|
317 - a - the visible bias (theano shared variable) |
71b0132b694a
mcRBM - removed container logic that was redundant with global methods
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
996
diff
changeset
|
318 - b - the covariance bias (theano shared variable) |
71b0132b694a
mcRBM - removed container logic that was redundant with global methods
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
996
diff
changeset
|
319 - c - the mean bias (theano shared variable) |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
320 |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
321 """ |
1267
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
322 def __init__(self, U, W, a, b, c): |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
323 self.U = U |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
324 self.W = W |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
325 self.a = a |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
326 self.b = b |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
327 self.c = c |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
328 |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
329 def hidden_cov_units_preactivation_given_v(self, v, small=0.5): |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
330 """Return the argument to the sigmoid that gives the mean of the covariance hidden units |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
331 |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
332 See the math at the top of this file for what 'adjusted' means. |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
333 |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
334 return b - 0.5 * dot(adjusted(v), U)**2 |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
335 """ |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
336 unit_v = v / (TT.sqrt(TT.mean(v**2, axis=1)+small)).dimshuffle(0,'x') # adjust row norm |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
337 return self.b - 0.5 * dot(unit_v, self.U)**2 |
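The pre-activation can be traced by hand for a single visible vector. The following stdlib-only sketch mirrors the two lines above on plain lists (row rescaling by the root mean square plus `small`, then bias minus half the squared projection); `cov_preactivation` is a hypothetical name and the inputs are made up.

```python
import math

def cov_preactivation(v, U, b, small=0.5):
    """b_k - 0.5 * (sum_i unit_v[i] * U[i][k])**2 for one visible vector v."""
    n = len(v)
    # Rescale v by sqrt(mean(v**2) + small), as in the expression above.
    scale = math.sqrt(sum(x * x for x in v) / n + small)
    unit_v = [x / scale for x in v]
    out = []
    for k in range(len(b)):
        proj = sum(unit_v[i] * U[i][k] for i in range(n))
        out.append(b[k] - 0.5 * proj ** 2)
    return out

v = [1.0, -1.0]
U = [[1.0, 0.0],
     [0.0, 2.0]]   # rows index visible units, columns index covariance units
b = [0.0, 1.0]
pre = cov_preactivation(v, U, b)
```

With these inputs the scale is sqrt(1.5), the squared projections are 2/3 and 8/3, and both pre-activations come out to -1/3. Because the squared term is never negative, each pre-activation is bounded above by its bias.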
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
338 |
1267
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
339 def free_energy_terms_given_v(self, v): |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
340 """Returns theano expression for the terms that are added to form the free energy of |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
341 visible vector `v` in an mcRBM. |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
342 |
1267
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
343 1. Free energy related to covariance hiddens |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
344 2. Free energy related to mean hiddens |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
345 3. Free energy related to L2-Norm of `v` |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
346 4. Free energy related to projection of `v` onto biases `a` |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
347 """ |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
348 t0 = -TT.sum(TT.nnet.softplus(self.hidden_cov_units_preactivation_given_v(v)),axis=1) |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
349 t1 = -TT.sum(TT.nnet.softplus(self.c + dot(v,self.W)), axis=1) |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
350 t2 = 0.5 * TT.sum(v**2, axis=1) |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
351 t3 = -TT.dot(v, self.a) |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
352 return [t0, t1, t2, t3] |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
353 |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
354 def free_energy_given_v(self, v): |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
355 """Returns theano expression for free energy of visible vector `v` in an mcRBM |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
356 """ |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
357 return TT.add(*self.free_energy_terms_given_v(v)) |
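As a cross-check of the four terms, here is a plain-Python sketch for one visible vector. It follows the expressions above under the assumption that `U` and `W` are stored as lists of rows indexed by visible unit; `free_energy_terms` and the toy parameter values are hypothetical.

```python
import math

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def free_energy_terms(v, U, W, a, b, c, small=0.5):
    """Return [t0, t1, t2, t3] for a single visible vector v."""
    n = len(v)
    scale = math.sqrt(sum(x * x for x in v) / n + small)
    unit_v = [x / scale for x in v]
    cov_pre = [b[k] - 0.5 * sum(unit_v[i] * U[i][k] for i in range(n)) ** 2
               for k in range(len(b))]
    mean_pre = [c[j] + sum(v[i] * W[i][j] for i in range(n))
                for j in range(len(c))]
    t0 = -sum(softplus(x) for x in cov_pre)     # covariance hiddens
    t1 = -sum(softplus(x) for x in mean_pre)    # mean hiddens
    t2 = 0.5 * sum(x * x for x in v)            # squared norm of v
    t3 = -sum(v[i] * a[i] for i in range(n))    # visible bias
    return [t0, t1, t2, t3]

v = [1.0, 1.0]
U = [[1.0], [1.0]]
W = [[1.0], [-1.0]]
a = [0.5, 0.5]
b = [0.0]
c = [0.0]
terms = free_energy_terms(v, U, W, a, b, c)
energy = sum(terms)
```

For this input the mean pre-activation is 0, so t1 is -log 2, while t2 is 1.0 and t3 is -1.0; the free energy is the sum of the four terms.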
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
358 |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
359 def expected_h_g_given_v(self, v): |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
360 """Returns tuple (`h`, `g`) of theano expressions for the conditional expectations in an mcRBM. |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
361 |
1267
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
362 `h` is the conditional expectation of the covariance units given `v`. |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
363 `g` is the conditional expectation of the mean units given `v`. |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
364 |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
365 """ |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
366 h = TT.nnet.sigmoid(self.hidden_cov_units_preactivation_given_v(v)) |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
367 g = TT.nnet.sigmoid(self.c + dot(v,self.W)) |
1267
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
368 return (h, g) |
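The sigmoids here are applied to the same pre-activations whose softplus appears in the free energy terms, and sigmoid is the derivative of softplus. The sketch below checks that relationship numerically with central differences; it uses only the standard library, and the helper names are hypothetical.

```python
import math

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def central_diff(f, x, h=1e-5):
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Compare d/dx softplus(x) with sigmoid(x) at a few pre-activation values.
xs = [-3.0, -0.5, 0.0, 2.0]
errors = [abs(central_diff(softplus, x) - sigmoid(x)) for x in xs]
```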
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
369 |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
370 def n_visible_units(self): |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
371 """Return the number of visible units of this RBM |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
372 |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
373 For an RBM made from shared variables, this will return an integer, |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
374 for a purely symbolic RBM this will return a theano expression. |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
375 |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
376 """ |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
377 try: |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
378 return self.W.get_value(borrow=True).shape[0] |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
379 except AttributeError: |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
380 return self.W.shape[0] |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
381 |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
382 def n_hidden_cov_units(self): |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
383 """Return the number of hidden units for the covariance in this RBM |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
384 |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
385 For an RBM made from shared variables, this will return an integer, |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
386 for a purely symbolic RBM this will return a theano expression. |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
387 |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
388 """ |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
389 try: |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
390 return self.U.value.shape[1] |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
391 except AttributeError: |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
392 return self.U.shape[1] |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
393 |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
394 def n_hidden_mean_units(self): |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
395 """Return the number of hidden units for the mean in this RBM |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
396 |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
397 For an RBM made from shared variables, this will return an integer, |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
398 for a purely symbolic RBM this will return a theano expression. |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
399 |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
400 """ |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
401 try: |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
402 return self.W.value.shape[1] |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
403 except AttributeError: |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
404 return self.W.shape[1] |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
405 |
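The three `n_*_units` methods above share one idiom: try the concrete `.value.shape` of a shared variable first, then fall back to the `.shape` of the object itself. A minimal NumPy-only sketch of that fallback follows. `ToyShared` and `n_columns` are hypothetical names introduced here for illustration; they are not part of pylearn or theano, and a plain array stands in for the symbolic case.

```python
import numpy as np


class ToyShared(object):
    """Hypothetical stand-in for a shared variable: keeps its array in `.value`."""

    def __init__(self, value):
        self.value = np.asarray(value)


def n_columns(param):
    """Return the second dimension of `param`, preferring `param.value`."""
    try:
        # Shared-variable-like objects expose their concrete array as `.value`.
        return param.value.shape[1]
    except AttributeError:
        # Anything without `.value` is asked for its own `.shape`.
        return param.shape[1]


shared_like = ToyShared(np.zeros((5, 3)))
plain_array = np.zeros((5, 7))
```

Here `n_columns(shared_like)` reads the shape through `.value`, while `n_columns(plain_array)` takes the `AttributeError` branch.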
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
406 def CD1_sampler(self, v, n_particles, n_visible=None, rng=8923984): |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
407 """Return a symbolic negative-phase particle obtained by simulating the Hamiltonian |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
408 associated with the energy function. |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
409 """ |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
410 #TODO: why not expose all HMC arguments somehow? |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
411 if not hasattr(rng, 'randn'): |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
412 rng = np.random.RandomState(rng) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
413 if n_visible is None: |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
414 n_visible = self.n_visible_units() |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
415 |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
416 # create a dummy hmc object because we want to use *some* of it |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
417 hmc = HMC_sampler.new_from_shared_positions( |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
418 shared_positions=v, # v is not shared, so some functionality will not work |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
419 energy_fn=self.free_energy_given_v, |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
420 seed=int(rng.randint(2**30)), |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
421 shared_positions_shape=(n_particles,n_visible), |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
422 compile_simulate=False) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
423 updates = dict(hmc.updates()) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
424 final_p = updates.pop(v) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
425 return hmc, final_p, updates |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
426 |
1267
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
427 def sampler(self, n_particles, n_visible=None, rng=7823748): |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
428 """Return an `HMC_sampler` that will draw samples from the distribution over visible |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
429 units specified by this RBM. |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
430 |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
431 :param n_particles: this many parallel chains will be simulated. |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
432 :param rng: seed or numpy RandomState object to initialize particles, and to drive the simulation. |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
433 """ |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
434 #TODO: why not expose all HMC arguments somehow? |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
435 #TODO: Consider returning a sample kwargs for passing to HMC_sampler? |
1267
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
436 if not hasattr(rng, 'randn'): |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
437 rng = np.random.RandomState(rng) |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
438 if n_visible is None: |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
439 n_visible = self.n_visible_units() |
1270
d38cb039c662
debugging mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1267
diff
changeset
|
440 rval = HMC_sampler.new_from_shared_positions( |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
441 shared_positions = sharedX( |
1267
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
442 rng.randn( |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
443 n_particles, |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
444 n_visible), |
1270
d38cb039c662
debugging mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1267
diff
changeset
|
445 name='particles'), |
1267
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
446 energy_fn=self.free_energy_given_v, |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
447 seed=int(rng.randint(2**30))) |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
448 return rval |
997
71b0132b694a
mcRBM - removed container logic that was redundant with global methods
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
996
diff
changeset
|
449 |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
450 if 0: |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
451 def as_feedforward_layer(self, v): |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
452 """Return a dictionary with keys: inputs, outputs and params |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
453 |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
454 The inputs are ``[v]``. |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
455 |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
456 The outputs are :math:`[E[h|v], E[g|v]]`, where `h` denotes the covariance hidden units and `g` denotes |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
457 the mean hidden units. |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
458 |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
459 The params are ``[U, W, b, c]``, the model parameters that enter into the conditional |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
460 expectations. |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
461 |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
462 :TODO: add an optional parameter to return only one of the expectations. |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
463 |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
464 """ |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
465 return dict( |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
466 inputs = [v], |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
467 outputs = list(self.expected_h_g_given_v(v)), |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
468 params = [self.U, self.W, self.b, self.c], |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
469 ) |
1270
d38cb039c662
debugging mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1267
diff
changeset
|
470 |
1275
f0129e37a8ef
mcRBM - changed params from lambda to method for pickling
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1273
diff
changeset
|
471 def params(self): |
f0129e37a8ef
mcRBM - changed params from lambda to method for pickling
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1273
diff
changeset
|
472 """Return the elements of [U,W,a,b,c] that are shared variables |
f0129e37a8ef
mcRBM - changed params from lambda to method for pickling
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1273
diff
changeset
|
473 |
f0129e37a8ef
mcRBM - changed params from lambda to method for pickling
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1273
diff
changeset
|
474 WRITEME : a *prescriptive* definition of this method suitable for mention in the API |
f0129e37a8ef
mcRBM - changed params from lambda to method for pickling
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1273
diff
changeset
|
475 doc. |
f0129e37a8ef
mcRBM - changed params from lambda to method for pickling
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1273
diff
changeset
|
476 |
f0129e37a8ef
mcRBM - changed params from lambda to method for pickling
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1273
diff
changeset
|
477 """ |
f0129e37a8ef
mcRBM - changed params from lambda to method for pickling
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1273
diff
changeset
|
478 return list(self._params) |
f0129e37a8ef
mcRBM - changed params from lambda to method for pickling
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1273
diff
changeset
|
479 |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
480 @classmethod |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
481 def alloc(cls, n_I, n_K, n_J, rng = 8923402190, |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
482 U_range=0.02, |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
483 W_range=0.05, |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
484 a_ival=0, |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
485 b_ival=2, |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
486 c_ival=-2): |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
487 """ |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
488 Return a MeanCovRBM instance with randomly-initialized shared variable parameters. |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
489 |
995
68ca3ea34e72
mcRBM - cleaned up new_from_dims
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
994
diff
changeset
|
490 :param n_I: input dimensionality |
68ca3ea34e72
mcRBM - cleaned up new_from_dims
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
994
diff
changeset
|
491 :param n_K: number of covariance hidden units |
68ca3ea34e72
mcRBM - cleaned up new_from_dims
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
994
diff
changeset
|
492 :param n_J: number of mean filters (linear) |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
493 :param rng: seed or numpy RandomState object to initialize parameters |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
494 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
495 :note: |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
496 Constants for initial ranges and values taken from train_mcRBM.py. |
995
68ca3ea34e72
mcRBM - cleaned up new_from_dims
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
994
diff
changeset
|
497 """ |
68ca3ea34e72
mcRBM - cleaned up new_from_dims
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
994
diff
changeset
|
498 if not hasattr(rng, 'randn'): |
68ca3ea34e72
mcRBM - cleaned up new_from_dims
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
994
diff
changeset
|
499 rng = np.random.RandomState(rng) |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
500 |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
501 rval = cls( |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
502 U = sharedX(U_range * rng.randn(n_I, n_K),'U'), |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
503 W = sharedX(W_range * rng.randn(n_I, n_J),'W'), |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
504 a = sharedX(np.ones(n_I)*a_ival,'a'), |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
505 b = sharedX(np.ones(n_K)*b_ival,'b'), |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
506 c = sharedX(np.ones(n_J)*c_ival,'c'),) |
1275
f0129e37a8ef
mcRBM - changed params from lambda to method for pickling
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1273
diff
changeset
|
507 rval._params = [rval.U, rval.W, rval.a, rval.b, rval.c] |
1267
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
508 return rval |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
509 |
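The initialization done by `alloc` can be sketched without theano, using plain NumPy arrays in place of `sharedX` variables. The function name `init_mcrbm_params` and the default seed `8923` are assumptions made for this sketch. The default seed in the source is larger than 2**32 - 1, which recent NumPy releases may reject in `RandomState`, so a smaller one is used here.

```python
import numpy as np


def init_mcrbm_params(n_I, n_K, n_J, rng=8923,
                      U_range=0.02, W_range=0.05,
                      a_ival=0, b_ival=2, c_ival=-2):
    """Return a dict of randomly initialized mcRBM parameter arrays.

    n_I: input dimensionality
    n_K: number of covariance hidden units
    n_J: number of mean hidden units
    rng: integer seed or numpy RandomState
    """
    # Accept either a seed or an existing RandomState, as the source does.
    if not hasattr(rng, 'randn'):
        rng = np.random.RandomState(rng)
    return dict(
        U=U_range * rng.randn(n_I, n_K),   # covariance filters
        W=W_range * rng.randn(n_I, n_J),   # mean filters
        a=np.ones(n_I) * a_ival,           # visible bias
        b=np.ones(n_K) * b_ival,           # covariance bias
        c=np.ones(n_J) * c_ival)           # mean bias


params = init_mcrbm_params(n_I=6, n_K=4, n_J=3)
```

Passing the same integer seed twice yields identical `U` and `W`. Passing a `RandomState` instead lets the caller share one random stream across several allocations.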
1349
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
510 def topological_connectivity(out_shape=(12,12), window_shape=(3,3), window_stride=(2,2), dtype=None): |
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
511 """Return a numpy R x C matrix with connection weights from R inputs (filters) to C outputs (covariance units). |
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
512 """ |
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
513 if window_stride[0] != window_stride[1]: |
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
514 raise ValueError('window_stride must be square', window_stride) |
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
515 if out_shape[0] != out_shape[1]: |
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
516 raise ValueError('out_shape must be square', out_shape) |
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
517 if window_shape[0] != window_shape[1]: |
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
518 raise ValueError('window_shape must be square', window_shape) |
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
519 |
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
520 if dtype is None: |
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
521 dtype = theano.config.floatX |
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
522 |
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
523 ker = numpy.asarray([1., 2., 1.], dtype=dtype) |
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
524 ker = numpy.outer(ker,ker) |
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
525 ker /= ker.sum() |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
526 |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
527 in_shape = (window_stride[0] * out_shape[0], |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
528 window_stride[1] * out_shape[1]) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
529 |
1349
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
530 tmp = numpy.zeros(in_shape, dtype=dtype) |
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
531 rval = numpy.zeros(in_shape + out_shape, dtype=dtype) |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
532 A,B,C,D = rval.shape |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
533 |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
534 # for each output position (out_r, out_c) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
535 for out_r in range(out_shape[0]): |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
536 for out_c in range(out_shape[1]): |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
537 # for each window position (win_r, win_c) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
538 for win_r in range(window_shape[0]): |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
539 for win_c in range(window_shape[1]): |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
540 # write the kernel weight at the corresponding (wrapped-around) input location |
1349
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
541 in_r = out_r * window_stride[0] + win_r - (window_shape[0]-1)//2 |
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
542 in_c = out_c * window_stride[1] + win_c - (window_shape[1]-1)//2 |
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
543 rval[(in_c+A)%A, (in_r+B)%B, out_r%C, out_c%D] = ker[win_r, win_c] |
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
544 return rval.reshape((A*B,C*D)) |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
545 |
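The construction in `topological_connectivity` can be sketched without theano. Each output unit pools a 3x3 window of a filter grid that is `stride` times larger, using a normalized [1, 2, 1] x [1, 2, 1] kernel with wrap-around at the borders. The name `pooling_matrix` and the `float64` default are assumptions of this sketch. Unlike the source, it indexes the input grid as (row, column) rather than (column, row), so the result is a transposed-input variant of the matrix built above.

```python
import numpy as np


def pooling_matrix(out_side=12, stride=2, dtype='float64'):
    """Return an (in_side**2, out_side**2) matrix of pooling weights."""
    # Separable 3x3 smoothing kernel, normalized to sum to one.
    ker = np.asarray([1., 2., 1.], dtype=dtype)
    ker = np.outer(ker, ker)
    ker /= ker.sum()
    window = ker.shape[0]
    half = (window - 1) // 2

    in_side = stride * out_side
    P = np.zeros((in_side, in_side, out_side, out_side), dtype=dtype)
    for out_r in range(out_side):
        for out_c in range(out_side):
            for win_r in range(window):
                for win_c in range(window):
                    # Centre the window on the strided position; wrap at borders.
                    in_r = (out_r * stride + win_r - half) % in_side
                    in_c = (out_c * stride + win_c - half) % in_side
                    P[in_r, in_c, out_r, out_c] = ker[win_r, win_c]
    return P.reshape((in_side * in_side, out_side * out_side))


P = pooling_matrix()
```

With these defaults every column of `P` holds the nine kernel weights of one output unit, so each column sums to one.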
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
546 class mcRBM_withP(mcRBM): |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
547 """Light-weight class that provides the math related to inference |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
548 |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
549 Attributes: |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
550 |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
551 - U - the covariance filters (theano shared variable) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
552 - W - the mean filters (theano shared variable) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
553 - a - the visible bias (theano shared variable) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
554 - b - the covariance bias (theano shared variable) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
555 - c - the mean bias (theano shared variable) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
556 |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
557 """ |
1348
c8c30c675a4f
mcRBM - fixed error in normalization of quadratic term
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1322
diff
changeset
|
558 # PCA eigenvectors (E) [scaled by 1/sqrt(eigenvals)] have to be passed to the model |
c8c30c675a4f
mcRBM - fixed error in normalization of quadratic term
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1322
diff
changeset
|
559 # because v has to be divided by the number of original dimensions |
c8c30c675a4f
mcRBM - fixed error in normalization of quadratic term
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1322
diff
changeset
|
560 # not the number of pca dimensions in hidden_cov_units_preactivation_given_v() |
c8c30c675a4f
mcRBM - fixed error in normalization of quadratic term
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1322
diff
changeset
|
561 def __init__(self, U, W, a, b, c, P, norm_doctoring): |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
562 self.P = P |
1348
c8c30c675a4f
mcRBM - fixed error in normalization of quadratic term
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1322
diff
changeset
|
563 self.norm_doctoring = norm_doctoring |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
564 super(mcRBM_withP, self).__init__(U,W,a,b,c) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
565 |
1348
c8c30c675a4f
mcRBM - fixed error in normalization of quadratic term
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1322
diff
changeset
|
566 def hidden_cov_units_preactivation_given_v(self, v): |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
567 """Return the argument to the sigmoid that gives the mean of the covariance hidden units |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
568 |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
569 """ |
1348
c8c30c675a4f
mcRBM - fixed error in normalization of quadratic term
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1322
diff
changeset
|
570 slope, offset = self.norm_doctoring |
c8c30c675a4f
mcRBM - fixed error in normalization of quadratic term
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1322
diff
changeset
|
571 norm_sq = TT.sum(v**2, axis=1) * slope + offset |
c8c30c675a4f
mcRBM - fixed error in normalization of quadratic term
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1322
diff
changeset
|
572 unit_v = v / TT.sqrt(norm_sq).dimshuffle(0,'x') # adjust row norm |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
573 return self.b + 0.5 * dot(dot(unit_v, self.U)**2, self.P) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
574 |
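The preactivation computed by `hidden_cov_units_preactivation_given_v` can be followed more easily in plain NumPy. The block below is a minimal sketch, not the Theano code itself: the function name `cov_preactivation`, the example shapes (192 visible dimensions, 8 factors) and the negated-identity pooling matrix are assumptions chosen for illustration.

```python
import numpy as np

def cov_preactivation(v, U, P, b, norm_doctoring=(1.0 / 192, 0.5)):
    """NumPy analogue (sketch) of hidden_cov_units_preactivation_given_v."""
    slope, offset = norm_doctoring
    # "Doctored" squared row norm: slope * ||v||^2 + offset.
    norm_sq = np.sum(v ** 2, axis=1) * slope + offset
    # Rescale each row of v; [:, None] plays the role of dimshuffle(0, 'x').
    unit_v = v / np.sqrt(norm_sq)[:, None]
    # Squared filter responses, pooled by P, plus the bias b.
    return b + 0.5 * np.dot(np.dot(unit_v, U) ** 2, P)

rng = np.random.RandomState(0)
v = rng.randn(4, 192)           # 4 examples, 192 visible dimensions (assumed)
U = 0.02 * rng.randn(192, 8)    # 8 factors (assumed)
P = -np.eye(8)                  # negated identity pooling (assumed)
b = 2.0 * np.ones(8)
pre = cov_preactivation(v, U, P, b)
```

With a non-positive `P`, the pooled quadratic term can only lower the preactivation, so every entry of `pre` stays at or below the bias.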
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
575 def n_hidden_cov_units(self): |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
576 """Return the number of hidden units for the covariance in this RBM |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
577 |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
578 For an RBM made from shared variables, this will return an integer, |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
579 for a purely symbolic RBM this will return a theano expression. |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
580 |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
581 """ |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
582 try: |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
583 return self.P.value.shape[1] |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
584 except AttributeError: |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
585 return self.P.shape[1] |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
586 |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
587 @classmethod |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
588 def alloc(cls, n_I, n_K, n_J, *args, **kwargs): |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
589 """ |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
590 Return an mcRBM_withP instance with randomly-initialized shared variable parameters. |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
591 |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
592 :param n_I: input dimensionality |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
593 :param n_K: number of covariance hidden units |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
594 :param n_J: number of mean filters (linear) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
595 :param rng: seed or numpy RandomState object to initialize parameters |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
596 |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
597 :note: |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
598 Constants for initial ranges and values taken from train_mcRBM.py. |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
599 """ |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
600 return cls.alloc_with_P( |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
601 -numpy.eye(n_K).astype(theano.config.floatX), |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
602 n_I, |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
603 n_J, |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
604 *args, **kwargs) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
605 |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
606 @classmethod |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
607 def alloc_topo_P(cls, n_I, n_J, p_out_shape=(12,12), p_win_shape=(3,3), p_win_stride=(2,2), |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
608 **kwargs): |
1349
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
609 # the factor of 2 is for some degree of compatibility with Marc'Aurelio's code: the 'w2' |
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
610 # in his code is -0.5 * self.P |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
611 return cls.alloc_with_P( |
1349
0d55f8f0aedc
mcRBM - changes to creation of topo_P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1348
diff
changeset
|
612 -2*topological_connectivity(p_out_shape, p_win_shape, p_win_stride), |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
613 n_I=n_I, n_J=n_J, **kwargs) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
614 |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
615 @classmethod |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
616 def alloc_with_P(cls, Pval, n_I, n_J, rng = 8923402190, |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
617 U_range=0.02, |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
618 W_range=0.05, |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
619 a_ival=0, |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
620 b_ival=2, |
1350
d957155264da
mcRBM - added norm_doctoring parameter
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1349
diff
changeset
|
621 c_ival=-2, |
d957155264da
mcRBM - added norm_doctoring parameter
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1349
diff
changeset
|
622 norm_doctoring=(1.0/192, .5)): |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
623 n_F, n_K = Pval.shape |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
624 if not hasattr(rng, 'randn'): |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
625 rng = np.random.RandomState(rng) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
626 rval = cls( |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
627 U = sharedX(U_range * rng.randn(n_I, n_F),'U'), |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
628 W = sharedX(W_range * rng.randn(n_I, n_J),'W'), |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
629 a = sharedX(np.ones(n_I)*a_ival,'a'), |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
630 b = sharedX(np.ones(n_K)*b_ival,'b'), |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
631 c = sharedX(np.ones(n_J)*c_ival,'c'), |
1350
d957155264da
mcRBM - added norm_doctoring parameter
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1349
diff
changeset
|
632 P = sharedX(Pval, 'P'), |
d957155264da
mcRBM - added norm_doctoring parameter
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1349
diff
changeset
|
633 norm_doctoring=norm_doctoring) |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
634 rval._params = [rval.U, rval.W, rval.a, rval.b, rval.c, rval.P] |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
635 return rval |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
636 |
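The shapes that `alloc_with_P` derives from the pooling matrix can be checked without Theano. The following is a rough sketch under stated assumptions: `init_param_values` is a hypothetical helper that returns plain NumPy arrays instead of shared variables, and the seed and sizes are arbitrary.

```python
import numpy as np

def init_param_values(Pval, n_I, n_J, rng=0, U_range=0.02, W_range=0.05,
                      a_ival=0, b_ival=2, c_ival=-2):
    """Hypothetical NumPy stand-in for the parameter setup in alloc_with_P."""
    # The pooling matrix fixes both the number of factors and of covariance units.
    n_F, n_K = Pval.shape
    if not hasattr(rng, 'randn'):
        rng = np.random.RandomState(rng)
    return {
        'U': U_range * rng.randn(n_I, n_F),   # visible -> factor filters
        'W': W_range * rng.randn(n_I, n_J),   # visible -> mean-unit filters
        'a': np.ones(n_I) * a_ival,           # visible bias
        'b': np.ones(n_K) * b_ival,           # covariance hidden bias
        'c': np.ones(n_J) * c_ival,           # mean hidden bias
        'P': Pval,                            # factor -> covariance-unit pooling
    }

params = init_param_values(-np.eye(6), n_I=192, n_J=4)
shapes = {name: value.shape for name, value in params.items()}
```

Here `shapes` maps each parameter name to its array shape, which makes it easy to confirm that `U` has one column per row of `P` and `b` has one entry per column of `P`.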
1267
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
637 class mcRBMTrainer(object): |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
638 """Light-weight class encapsulating math for mcRBM training |
1267
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
639 |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
640 Attributes: |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
641 - rbm - an mcRBM instance |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
642 - sampler - an HMC_sampler instance |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
643 - normVF - geometrically updated norm of U matrix columns (shared var) |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
644 - learn_rate - SGD learning rate [un-annealed] |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
645 - learn_rate_multipliers - the learning rates for each of the parameters of the rbm (in |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
646 order corresponding to what's returned by ``rbm.params()``) |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
647 - l1_penalty - float or TensorType scalar to modulate l1 penalty of rbm.U and rbm.W |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
648 - iter - number of cd_updates (shared var) - used to anneal the effective learn_rate |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
649 - lr_anneal_start - scalar or TensorType scalar - iter at which time to start decreasing |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
650 the learning rate proportional to 1/iter |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
651 |
1267
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
652 """ |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
653 # TODO: accept a GD algo as an argument? |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
654 @classmethod |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
655 def alloc_for_P(cls, rbm, visible_batch, batchsize, initial_lr_per_example=0.075, rng=234, |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
656 l1_penalty=0, |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
657 l1_penalty_start=0, |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
658 learn_rate_multipliers=None, |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
659 lr_anneal_start=2000, |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
660 p_training_start=4000, |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
661 p_training_lr=0.02, |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
662 persistent_chains=True |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
663 ): |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
664 if learn_rate_multipliers is None: |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
665 p_lr = sharedX(0.0, 'P_lr_multiplier') |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
666 learn_rate_multipliers = [2, .2, .02, .1, .02, p_lr] |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
667 else: |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
668 p_lr = None |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
669 rval = cls.alloc(rbm, visible_batch, batchsize, initial_lr_per_example, rng, l1_penalty, |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
670 l1_penalty_start, learn_rate_multipliers, lr_anneal_start, persistent_chains) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
671 |
1322
cdda4f98c2a2
mcRBM - added mask for updates to P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1287
diff
changeset
|
672 rval.p_mask = sharedX((rbm.P.value!=0).astype('float32'), 'p_mask') |
cdda4f98c2a2
mcRBM - added mask for updates to P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1287
diff
changeset
|
673 |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
674 rval.p_lr = p_lr |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
675 rval.p_training_start=p_training_start |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
676 rval.p_training_lr=p_training_lr |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
677 return rval |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
678 |
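`alloc_for_P` stores `p_mask`, a 0/1 matrix marking the non-zero entries of `P`. How the mask enters the update is not shown in this part of the file; the sketch below assumes the common pattern of multiplying the gradient step by the mask so that entries that start at zero stay at zero. The toy matrix, the stand-in gradient and the step size are all illustrative.

```python
import numpy as np

# Toy pooling matrix with a sparse connectivity pattern (illustrative values).
P = np.array([[-2.0, 0.0],
              [-2.0, -2.0],
              [0.0, -2.0]], dtype='float32')

# Same construction as p_mask: 1 where P is connected, 0 elsewhere.
p_mask = (P != 0).astype('float32')

grad_P = np.ones_like(P)   # stand-in gradient, not a real CD gradient
lr = 0.02                  # assumed step size for the sketch

# Assumed masked update: only connected entries move.
new_P = P - lr * grad_P * p_mask
```

Under this assumption the sparsity pattern of `P` is preserved by construction, whatever the gradient looks like.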
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
679 |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
680 @classmethod |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
681 def alloc(cls, rbm, visible_batch, batchsize, initial_lr_per_example=0.075, rng=234, |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
682 l1_penalty=0, |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
683 l1_penalty_start=0, |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
684 learn_rate_multipliers=[2, .2, .02, .1, .02], |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
685 lr_anneal_start=2000, |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
686 persistent_chains=True |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
687 ): |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
688 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
689 """ |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
690 :param rbm: mcRBM instance to train |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
691 :param visible_batch: TensorType variable for training data |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
692 :param batchsize: the number of rows in visible_batch |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
693 :param initial_lr_per_example: the learning rate (may be annealed) |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
694 :param rng: seed or RandomState to initialize PCD sampler |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
695 :param l1_penalty: see class doc |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
696 :param learn_rate_multipliers: see class doc |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
697 :param lr_anneal_start: see class doc |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
698 """ |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
699 #TODO: :param lr_anneal_iter: the iteration at which 1/t annealing will begin |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
700 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
701 #TODO: get batchsize from visible_batch?? |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
702 # allocates shared var for negative phase particles |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
703 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
704 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
705 # TODO: should normVF be initialized to match the size of rbm.U ? |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
706 |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
707 if (l1_penalty_start > 0) and (l1_penalty != 0.0): |
1287
4fa2a32e8fde
mcRBM - renamed shared variable
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1284
diff
changeset
|
708 effective_l1_penalty = sharedX(0.0, 'effective_l1_penalty') |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
709 else: |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
710 effective_l1_penalty = l1_penalty |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
711 |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
712 if persistent_chains: |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
713 sampler = rbm.sampler(batchsize, rng=rng) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
714 else: |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
715 sampler = None |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
716 |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
717 return cls( |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
718 rbm=rbm, |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
719 batchsize=batchsize, |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
720 visible_batch=visible_batch, |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
721 sampler=sampler, |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
722 normVF=sharedX(1.0, 'normVF'), |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
723 learn_rate=sharedX(initial_lr_per_example/batchsize, 'learn_rate'), |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
724 iter=sharedX(0, 'iter'), |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
725 effective_l1_penalty=effective_l1_penalty, |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
726 l1_penalty=l1_penalty, |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
727 l1_penalty_start=l1_penalty_start, |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
728 learn_rate_multipliers=learn_rate_multipliers, |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
729 lr_anneal_start=lr_anneal_start, |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
730 persistent_chains=persistent_chains,) |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
731 |
1267
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
732 def __init__(self, **kwargs): |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
733 self.__dict__.update(kwargs) |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
734 |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
735 def normalize_U(self, new_U): |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
736 """ |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
737 :param new_U: a proposed new value for rbm.U |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
738 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
739 :returns: a pair of TensorType variables: |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
740 a corrected new value for U, and a new value for self.normVF |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
741 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
742 This is a weird normalization procedure, but the sample code for the paper has it, and |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
743 it seems to be important. |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
744 """ |
1267
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
745 U_norms = TT.sqrt((new_U**2).sum(axis=0)) |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
746 new_normVF = .95 * self.normVF + .05 * TT.mean(U_norms) |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
747 return (new_U * new_normVF / U_norms), new_normVF |
1267
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
748 |
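The column rescaling done by `normalize_U` above can be sketched in plain Python. This is a minimal illustration, not the module's code: `normalize_columns` and its `decay` argument are hypothetical names, and nested lists stand in for the Theano matrix. The 0.95 / 0.05 weights are taken from the listing.

```python
import math

def normalize_columns(new_U, norm_vf, decay=0.95):
    """Rescale every column of new_U to one shared, slowly moving norm.

    new_U is a list of rows (a plain-Python stand-in for the Theano matrix);
    norm_vf plays the role of the running average stored in self.normVF.
    """
    n_rows, n_cols = len(new_U), len(new_U[0])
    # L2 norm of each column, mirroring TT.sqrt((new_U**2).sum(axis=0))
    col_norms = [math.sqrt(sum(new_U[i][j] ** 2 for i in range(n_rows)))
                 for j in range(n_cols)]
    # exponential moving average of the mean column norm
    new_norm_vf = decay * norm_vf + (1.0 - decay) * sum(col_norms) / n_cols
    # after this step every column has L2 norm new_norm_vf
    scaled = [[new_U[i][j] * new_norm_vf / col_norms[j] for j in range(n_cols)]
              for i in range(n_rows)]
    return scaled, new_norm_vf

# Column norms are 5 and 2, so the running average moves from 1.0 toward 3.5.
U = [[3.0, 0.0],
     [4.0, 2.0]]
scaled_U, norm_vf = normalize_columns(U, norm_vf=1.0)
```

Note that the directions of the columns are preserved; only their lengths are tied to the running average. A column of zeros would cause a division by zero in this sketch, and the expression in the listing has the same form.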
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
749 def contrastive_grads(self, neg_v = None): |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
750 """Return the contrastive divergence gradients on the parameters of self.rbm """ |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
751 if neg_v is None: |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
752 neg_v = self.sampler.positions |
1267
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
753 return contrastive_grad( |
1270
d38cb039c662
debugging mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1267
diff
changeset
|
754 free_energy_fn=self.rbm.free_energy_given_v, |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
755 pos_v=self.visible_batch, |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
756 neg_v=neg_v, |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
757 wrt = self.rbm.params(), |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
758 other_cost=(l1(self.rbm.U)+l1(self.rbm.W)) * self.effective_l1_penalty) |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
759 |
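The body of `contrastive_grad` is not shown in this part of the listing. Assuming it follows the usual contrastive divergence form (the gradient of the mean free energy on the data, minus the mean free energy on the negative samples, plus `other_cost`), a toy scalar version might look like the sketch below. The free energy `F(v; w) = 0.5 * w * v**2` and the name `toy_contrastive_grad` are made up for illustration.

```python
def toy_contrastive_grad(w, pos_v, neg_v, l1_penalty=0.0):
    """Gradient w.r.t. w of  mean F(pos) - mean F(neg) + l1_penalty * |w|
    for the toy free energy F(v; w) = 0.5 * w * v**2 (an assumption)."""
    def dF_dw(v):
        # analytic derivative of the toy free energy with respect to w
        return 0.5 * v * v

    pos_term = sum(dF_dw(v) for v in pos_v) / len(pos_v)   # data (positive) phase
    neg_term = sum(dF_dw(v) for v in neg_v) / len(neg_v)   # sampled (negative) phase
    sign_w = (w > 0) - (w < 0)                              # subgradient of |w|
    return pos_term - neg_term + l1_penalty * sign_w

g = toy_contrastive_grad(w=2.0, pos_v=[1.0, 3.0], neg_v=[2.0, 2.0], l1_penalty=0.1)
```

In the listing the negative samples come either from the persistent sampler (`self.sampler.positions`) or from the CD-1 chain (`final_p`), and the L1 term is applied to both `U` and `W`.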
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
760 def cd_updates(self): |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
761 """ |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
762 Return a dictionary of shared variable updates that implements contrastive divergence |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
763 learning by stochastic gradient descent with an annealed learning rate. |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
764 """ |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
765 |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
766 ups = {} |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
767 |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
768 if self.persistent_chains: |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
769 grads = self.contrastive_grads() |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
770 ups.update(dict(self.sampler.updates())) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
771 else: |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
772 cd1_sampler, final_p, cd1_updates = self.rbm.CD1_sampler(self.visible_batch, |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
773 self.batchsize) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
774 self._last_cd1_sampler = cd1_sampler # hacked in here for the unit test |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
775 #ignore the cd1_sampler |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
776 grads = self.contrastive_grads(neg_v = final_p) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
777 ups.update(dict(cd1_updates)) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
778 |
1267
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
779 |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
780 # contrastive divergence updates |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
781 # TODO: sgd_updates is a particular optimization algo (others are possible) |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
782 # parametrize so that the algo is a plugin |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
783 # the normalization normVF might be sgd-specific though... |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
784 |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
785 # TODO: when sgd has an annealing schedule, this should |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
786 # go through that mechanism. |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
787 |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
788 lr = TT.clip( |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
789 self.learn_rate * TT.cast(self.lr_anneal_start / (self.iter+1), floatX), |
1267
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
790 0.0, #min |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
791 self.learn_rate) #max |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
792 |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
793 ups.update(dict(sgd_updates( |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
794 self.rbm.params(), |
1267
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
795 grads, |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
796 stepsizes=[a*lr for a in self.learn_rate_multipliers]))) |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
797 |
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
798 ups[self.iter] = self.iter + 1 |
979
2a53384d9742
mcRBM - hacks to driver
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
978
diff
changeset
|
799 |
1267
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
800 |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
801 # add trainer updates (replace CD update of U) |
1272
ba25c6e4f55d
mcRBM working with whole learning algo in theano
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1270
diff
changeset
|
802 ups[self.rbm.U], ups[self.normVF] = self.normalize_U(ups[self.rbm.U]) |
1267
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
803 |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
804 #l1_updates: |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
805 if (self.l1_penalty_start > 0) and (self.l1_penalty != 0.0): |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
806 ups[self.effective_l1_penalty] = TT.switch( |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
807 self.iter >= self.l1_penalty_start, |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
808 self.l1_penalty, |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
809 0.0) |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
810 |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
811 if getattr(self,'p_lr', None): |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
812 ups[self.p_lr] = TT.switch(self.iter > self.p_training_start, |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
813 self.p_training_lr, |
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
814 0) |
1322
cdda4f98c2a2
mcRBM - added mask for updates to P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1287
diff
changeset
|
815 new_P = ups[self.rbm.P] * self.p_mask |
cdda4f98c2a2
mcRBM - added mask for updates to P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1287
diff
changeset
|
816 no_pos_P = TT.switch(new_P<0, new_P, 0) |
cdda4f98c2a2
mcRBM - added mask for updates to P matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1287
diff
changeset
|
817 ups[self.rbm.P] = - no_pos_P / no_pos_P.sum(axis=0) #normalize so that columns sum to -1 (entries stay non-positive, unit L1 norm) |
1284
1817485d586d
mcRBM - many changes incl. adding support for pooling matrix
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1275
diff
changeset
|
818 |
1267
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
819 return ups |
075c193afd1b
refactoring mcRBM
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
1000
diff
changeset
|
820 |
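Two pieces of arithmetic in `cd_updates` above are easy to misread: the clipped learning-rate schedule and the projection applied to `P`. The sketch below restates both in plain Python. The function names are hypothetical, nested lists stand in for Theano matrices, and true floating-point division is assumed in the schedule.

```python
def annealed_lr(learn_rate, lr_anneal_start, iteration):
    """Constant rate until about lr_anneal_start iterations, then ~1/t decay."""
    raw = learn_rate * lr_anneal_start / (iteration + 1)
    # same bounds as TT.clip(..., 0.0, self.learn_rate)
    return min(max(raw, 0.0), learn_rate)

def project_P(new_P, mask):
    """Mask P, drop positive entries, and rescale each column to sum to -1."""
    rows, cols = len(new_P), len(new_P[0])
    masked = [[new_P[i][j] * mask[i][j] for j in range(cols)] for i in range(rows)]
    # keep only negative entries, as in TT.switch(new_P < 0, new_P, 0)
    no_pos = [[x if x < 0 else 0.0 for x in row] for row in masked]
    col_sums = [sum(no_pos[i][j] for i in range(rows)) for j in range(cols)]
    # -x / (negative column sum) is non-positive, so each column sums to -1
    return [[-no_pos[i][j] / col_sums[j] for j in range(cols)] for i in range(rows)]

lr_early = annealed_lr(0.1, 100, 0)      # raw value is 10.0, clipped to learn_rate
lr_late = annealed_lr(0.1, 100, 199)     # past the anneal start: half of learn_rate

P = project_P([[-1.0, 0.5],
               [-3.0, -2.0]],
              mask=[[1.0, 1.0],
                    [1.0, 1.0]])
col0_sum = P[0][0] + P[1][0]
col1_sum = P[0][1] + P[1][1]
```

Because the kept entries of `P` are non-positive, dividing by the (negative) column sum and negating leaves every entry non-positive with unit L1 norm per column. A column whose entries are all masked out or positive would divide by zero, both here and in the listed expression.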