annotate pylearn/algorithms/mcRBM.py @ 984:5badf36a6daf
mcRBM - added notes to leading comment
author | James Bergstra <bergstrj@iro.umontreal.ca> |
---|---|
date | Tue, 24 Aug 2010 13:50:26 -0400 |
parents | 2a53384d9742 |
children | 78b5bdf967f6 |
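rev | changeset | summary |
---|---|---|
967 | 90e11d5d0a41 | adding algorithms/mcRBM, but it is not done yet |
972 | 0b392d1401c5 | mcRBM - adding math and comments |
973 | aa201f357d7b | mcRBM - added numpy import |
984 | 5badf36a6daf | mcRBM - added notes to leading comment |
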
"""
This file implements the Mean & Covariance RBM discussed in

    Ranzato, M. and Hinton, G. E. (2010)
    Modeling pixel means and covariances using factored third-order Boltzmann machines.
    IEEE Conference on Computer Vision and Pattern Recognition.

and performs one of the experiments on CIFAR-10 discussed in that paper.  There are some minor
discrepancies between the paper and the accompanying code (train_mcRBM.py), and the
accompanying code has been taken to be correct in those cases because I couldn't get things to
work otherwise.


Math
====

Energy of "covariance RBM"

    E = -0.5 \sum_f \sum_k P_{fk} h_k ( \sum_i C_{if} v_i )^2
      = -0.5 \sum_f (\sum_k P_{fk} h_k) ( \sum_i C_{if} v_i )^2
                    "vector element f"    "vector element f"

In some parts of the paper, the P matrix is chosen to be a diagonal matrix with non-positive
diagonal entries, so it is helpful to see this as a simpler equation (written here for P equal
to the negative identity matrix, so that F = K):

    E = 0.5 \sum_f h_f ( \sum_i C_{if} v_i )^2
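
The two forms of the covariance energy can be checked against each other numerically. The
following is a minimal NumPy sketch and is not part of this module: the function names, the
sizes and the random inputs are all assumptions made for illustration.

```python
import numpy as np

def covariance_energy(v, h, C, P):
    # Vectorized form: -0.5 * sum_f (sum_k P[f,k] h[k]) * (sum_i C[i,f] v[i])**2
    filter_out = np.dot(v, C)   # shape (F,), one projection per filter
    pooled = np.dot(P, h)       # shape (F,), hidden units pooled per filter
    return -0.5 * np.sum(pooled * filter_out ** 2)

def covariance_energy_loops(v, h, C, P):
    # Literal double sum over f and k, kept slow on purpose for comparison.
    n_in, n_filters = C.shape
    total = 0.0
    for f in range(n_filters):
        proj = sum(C[i, f] * v[i] for i in range(n_in))
        for k in range(h.shape[0]):
            total += P[f, k] * h[k] * proj ** 2
    return -0.5 * total

rng = np.random.RandomState(0)
I, F, K = 5, 4, 3                      # illustrative sizes
v = rng.randn(I)
h = rng.randint(0, 2, size=K).astype(float)
C = rng.randn(I, F)
P = -np.abs(rng.randn(F, K))           # non-positive pooling weights

assert np.allclose(covariance_energy(v, h, C, P),
                   covariance_energy_loops(v, h, C, P))

# With P = -identity (so F == K) the energy is 0.5 * sum_f h_f (C[:, f] . v)**2
h_sq = rng.randint(0, 2, size=F).astype(float)
simple = 0.5 * np.sum(h_sq * np.dot(v, C) ** 2)
assert np.allclose(covariance_energy(v, h_sq, C, -np.eye(F)), simple)
```

Because P is non-positive and h is binary, this energy is never negative: switching a hidden
unit on can only add a penalty for inputs that align with its filters.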



Version in paper
----------------

Full Energy of the Mean and Covariance RBM, with
:math:`h_k = h_k^{(c)}`,
:math:`g_j = h_j^{(m)}`,
:math:`b_k = b_k^{(c)}`,
:math:`c_j = b_j^{(m)}`,
:math:`U_{if} = C_{if}`:

    E (v, h, g) =
        - 0.5 \sum_f \sum_k P_{fk} h_k ( \sum_i U_{if} v_i / (|U_{*f}| * |v|) )^2
        - \sum_k b_k h_k
        + 0.5 \sum_i v_i^2
        - \sum_j \sum_i W_{ij} g_j v_i
        - \sum_j c_j g_j

For the energy function to correspond to a probability distribution, P must be non-positive.  P
is initialized to be a diagonal matrix, and in our experience it can be left as such because
even in the paper it has a very low learning rate, and is only allowed to be updated after the
filters in U are learned (in effect).
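
As a rough illustration of the paper's energy function, the sketch below evaluates it with
NumPy and checks that the covariance term depends only on the angle between v and each column
of U. The function names (`cosine_sq`, `paper_energy`), the sizes and the random parameters
are assumptions for this example and do not appear in the paper or in train_mcRBM.py.

```python
import numpy as np

def cosine_sq(v, U):
    # Squared cosine between v and every column of U, shape (F,).
    col_norms = np.sqrt((U ** 2).sum(axis=0))
    return (np.dot(v, U) / (col_norms * np.sqrt(np.dot(v, v)))) ** 2

def paper_energy(v, h, g, P, U, W, b, c):
    cov_term = -0.5 * np.dot(np.dot(P, h), cosine_sq(v, U))
    return (cov_term
            - np.dot(b, h)
            + 0.5 * np.dot(v, v)
            - np.dot(v, np.dot(W, g))
            - np.dot(c, g))

rng = np.random.RandomState(1)
I, F, J, K = 6, 4, 3, 4                # illustrative sizes
v = rng.randn(I)
h = rng.randint(0, 2, size=K).astype(float)
g = rng.randint(0, 2, size=J).astype(float)
P = -np.eye(F, K)                      # diagonal, non-positive
U = rng.randn(I, F)
W = rng.randn(I, J)
b = rng.randn(K)
c = rng.randn(J)

# Rescaling the filters leaves the whole energy unchanged ...
assert np.allclose(paper_energy(v, h, g, P, U, W, b, c),
                   paper_energy(v, h, g, P, 3.0 * U, W, b, c))
# ... and rescaling v leaves the covariance term unchanged.
assert np.allclose(cosine_sq(v, U), cosine_sq(2.0 * v, U))
```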

Version in published train_mcRBM code
-------------------------------------

The train_mcRBM file implements learning with a similar but technically different energy
function:

    E (v, h, g) =
        - 0.5 \sum_f \sum_k P_{fk} h_k (\sum_i U_{if} v_i / sqrt(\sum_i v_i^2/I + 0.5))^2
        - \sum_k b_k h_k
        + 0.5 \sum_i v_i^2
        - \sum_j \sum_i W_{ij} g_j v_i
        - \sum_j c_j g_j

There are two differences with respect to the paper:

    - 'v' is not normalized by its length, but rather it is normalized to have length close to
      the square root of the number of its components.  The variable called 'small' that
      "avoids division by zero" is orders of magnitude larger than machine precision, and is
      on the order of the normalized sum-of-squares, so I've included it in the Energy
      function.

    - 'U' is also not normalized by its length.  U is initialized to have columns that are
      shorter than unit-length (approximately 0.2 with the 105 principal components in the
      train_mcRBM data).  During training, the columns of U are constrained manually to have
      equal lengths (see the use of normVF), but their common Euclidean norm is allowed to
      change.  During learning it quickly converges towards 1 and then exceeds 1.  This
      column-wise normalization of U does not seem to be justified by maximum likelihood,
      and I have no intuition for why it is used.
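
The two normalizations described in the list above can be sketched as follows. This is an
illustration under stated assumptions, not a transcription of train_mcRBM.py: the default
`small=0.5` comes from the energy function above, while `equalize_column_norms` is a
hypothetical helper that rescales every column to the mean column norm, which is only one
possible way of giving the columns equal lengths and is not claimed to match normVF.

```python
import numpy as np

def normalize_visible(v, small=0.5):
    # Divide v by sqrt(mean(v_i^2) + small), as in the energy function above.
    return v / np.sqrt(np.mean(v ** 2) + small)

def equalize_column_norms(U):
    # Hypothetical helper (not normVF): rescale each column to the mean column norm.
    norms = np.sqrt((U ** 2).sum(axis=0))
    return U * (norms.mean() / norms)

rng = np.random.RandomState(0)
I, F = 105, 8                          # illustrative sizes
v = rng.randn(I)

# The normalized vector is always shorter than sqrt(I) ...
short = np.linalg.norm(normalize_visible(v))
# ... and approaches sqrt(I) once mean(v_i^2) dominates the 0.5 constant.
long_ = np.linalg.norm(normalize_visible(100.0 * v))

# Columns start at an illustrative scale and end up with one shared length.
U = equalize_column_norms(0.02 * rng.randn(I, F))
column_norms = np.sqrt((U ** 2).sum(axis=0))
```

The length of the normalized vector is sqrt(I) * sqrt(s / (s + 0.5)) with s = mean(v_i^2),
which is why the 0.5 constant matters for unit-variance inputs and cannot be treated as a
numerical safeguard only.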


Version in this code
--------------------

This file implements the same algorithm as the train_mcRBM code, except that the P matrix is
omitted for clarity, and replaced analytically with a negative identity matrix.

    E (v, h, g) =
        + 0.5 \sum_k h_k (\sum_i U_{ik} v_i / sqrt(\sum_i v_i^2/I + 0.5))^2
        - \sum_k b_k h_k
        + 0.5 \sum_i v_i^2
        - \sum_j \sum_i W_{ij} g_j v_i
        - \sum_j c_j g_j
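
The claim that dropping P is the same as fixing it to the negative identity can be checked
numerically. The sketch below is an assumption-laden illustration: `train_mcrbm_energy` and
`this_file_energy` are names invented here to mirror the two energy functions written out in
this docstring, and they are not the functions defined later in this module.

```python
import numpy as np

def _normalized(v, small=0.5):
    # v / sqrt(sum_i v_i^2 / I + small)
    return v / np.sqrt(np.mean(v ** 2) + small)

def _mean_part(v, g, W, c):
    # Terms shared by both versions that involve v and g only.
    return 0.5 * np.dot(v, v) - np.dot(v, np.dot(W, g)) - np.dot(c, g)

def train_mcrbm_energy(v, h, g, P, U, W, b, c):
    cov = -0.5 * np.dot(np.dot(P, h), np.dot(_normalized(v), U) ** 2)
    return cov - np.dot(b, h) + _mean_part(v, g, W, c)

def this_file_energy(v, h, g, U, W, b, c):
    cov = 0.5 * np.dot(h, np.dot(_normalized(v), U) ** 2)
    return cov - np.dot(b, h) + _mean_part(v, g, W, c)

rng = np.random.RandomState(2)
I, J, K = 6, 3, 4                      # illustrative sizes, with F == K
v = rng.randn(I)
h = rng.randint(0, 2, size=K).astype(float)
g = rng.randint(0, 2, size=J).astype(float)
U = rng.randn(I, K)
W = rng.randn(I, J)
b = rng.randn(K)
c = rng.randn(J)

assert np.allclose(train_mcrbm_energy(v, h, g, -np.eye(K), U, W, b, c),
                   this_file_energy(v, h, g, U, W, b, c))
```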



Conventions in this file
========================

This file contains some global functions, as well as a class (MeanCovRBM) that makes using
them a little more convenient.


Global functions like `free_energy` work on an mcRBM as parametrized in a particular way.
Suppose we have
    I input dimensions,
    F squared filters,
    J mean variables, and
    K covariance variables.
The mcRBM is parametrized by 5 variables:

- `P`, a matrix (probably sparse) of pooling (F x K)
- `U`, a matrix whose columns are visible covariance directions (I x F)
- `W`, a matrix whose columns are visible mean directions (I x J)
- `b`, a vector of hidden covariance biases (K)
- `c`, a vector of hidden mean biases (J)

Matrices are generally laid out and accessed according to a C-order convention.
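
To make the shapes concrete, here is a small NumPy sketch of the five parameters. The sizes,
the initial values and the one-example-per-row minibatch layout are assumptions chosen for
illustration; they are not taken from this module's code.

```python
import numpy as np

I, F, J, K = 6, 4, 3, 4               # illustrative sizes; F == K so P can be diagonal

rng = np.random.RandomState(0)
P = -np.eye(F, K)                     # pooling matrix, (F x K)
U = 0.1 * rng.randn(I, F)             # covariance filters, (I x F)
W = 0.1 * rng.randn(I, J)             # mean filters, (I x J)
b = np.zeros(K)                       # hidden covariance biases, (K,)
c = np.zeros(J)                       # hidden mean biases, (J,)

# NumPy arrays are C-ordered (row-major) unless requested otherwise.
assert U.flags['C_CONTIGUOUS'] and W.flags['C_CONTIGUOUS']

# Assumed layout: a minibatch holds one example per row, so all filter
# responses come from a single matrix product.
batch = rng.randn(10, I)
cov_inputs = np.dot(batch, U)              # (10 x F)
mean_inputs = np.dot(batch, W) + c         # (10 x J)
pooled = np.dot(cov_inputs ** 2, P) + b    # (10 x K)
```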

"""

#
# WORKING NOTES
# THIS DERIVATION IS BASED ON THE ** PAPER ** ENERGY FUNCTION
# NOT THE ENERGY FUNCTION IN THE CODE!!!
#
# Free energy is the marginal energy of visible units
# Recall:
#   Q(x) = exp(-E(x))/Z ==> -log(Q(x)) - log(Z) = E(x)
#
#
#   E (v, h, g) =
#       - 0.5 \sum_f \sum_k P_{fk} h_k ( \sum_i U_{if} v_i )^2 / (|U_{*f}|^2 |v|^2)
#       - \sum_k b_k h_k
#       + 0.5 \sum_i v_i^2
#       - \sum_j \sum_i W_{ij} g_j v_i
#       - \sum_j c_j g_j
#       - \sum_i a_i v_i
#
#
# Derivation, in which partition functions are ignored.
#
# E(v) = -\log(Q(v))
#  = -\log( \sum_{h,g} Q(v,h,g))
#  = -\log( \sum_{h,g} exp(-E(v,h,g)))
#  = -\log( \sum_{h,g} exp(-(
#       - 0.5 \sum_f \sum_k P_{fk} h_k ( \sum_i U_{if} v_i )^2 / (|U_{*f}|^2 |v|^2)
#       - \sum_k b_k h_k
#       + 0.5 \sum_i v_i^2
#       - \sum_j \sum_i W_{ij} g_j v_i
#       - \sum_j c_j g_j
#       - \sum_i a_i v_i )))
#
# Get rid of double negs in exp
#  = -\log(  \sum_{h} exp(
#       + 0.5 \sum_f \sum_k P_{fk} h_k ( \sum_i U_{if} v_i )^2 / (|U_{*f}|^2 |v|^2)
#       + \sum_k b_k h_k
#       - 0.5 \sum_i v_i^2
#       ) * \sum_{g} exp(
#       + \sum_j \sum_i W_{ij} g_j v_i
#       + \sum_j c_j g_j))
#    - \sum_i a_i v_i
#
# Break up log
#  = -\log(  \sum_{h} exp(
#       + 0.5 \sum_f \sum_k P_{fk} h_k ( \sum_i U_{if} v_i )^2 / (|U_{*f}|^2 |v|^2)
#       + \sum_k b_k h_k
#       ))
#    -\log( \sum_{g} exp(
#       + \sum_j \sum_i W_{ij} g_j v_i
#       + \sum_j c_j g_j ))
#    + 0.5 \sum_i v_i^2
#    - \sum_i a_i v_i
#
# Use the fact that h and g are binary to turn log(sum(exp(sum...))) into sum(log(..
#  = -\log(\sum_{h} exp(
#       + 0.5 \sum_f \sum_k P_{fk} h_k ( \sum_i U_{if} v_i )^2 / (|U_{*f}|^2 |v|^2)
#       + \sum_k b_k h_k
#       ))
#    - \sum_{j} \log(1 + exp(\sum_i W_{ij} v_i + c_j ))
#    + 0.5 \sum_i v_i^2
#    - \sum_i a_i v_i
#
#  = - \sum_{k} \log(1 + exp(b_k + 0.5 \sum_f P_{fk}( \sum_i U_{if} v_i )^2 / (|U_{*f}|^2 |v|^2)))
#    - \sum_{j} \log(1 + exp(\sum_i W_{ij} v_i + c_j ))
#    + 0.5 \sum_i v_i^2
#    - \sum_i a_i v_i
#
# For negative-one-diagonal P this gives:
#
#  = - \sum_{k} \log(1 + exp(b_k - 0.5 ( \sum_i U_{ik} v_i )^2 / (|U_{*k}|^2 |v|^2)))
#    - \sum_{j} \log(1 + exp(\sum_i W_{ij} v_i + c_j ))
#    + 0.5 \sum_i v_i^2
#    - \sum_i a_i v_i
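
The closed form reached at the end of this derivation can be checked against a brute-force
sum over every binary configuration of h and g. What follows is a minimal NumPy sketch for
the paper's energy function with a visible bias `a`; the function names, the tiny sizes and
the random parameters are assumptions for illustration, and this is not the `free_energy`
function defined in this module.

```python
import itertools
import numpy as np

def cosine_sq(v, U):
    # (sum_i U[i,f] v[i])**2 / (|U[:,f]|**2 * |v|**2), shape (F,)
    return np.dot(v, U) ** 2 / ((U ** 2).sum(axis=0) * np.dot(v, v))

def energy(v, h, g, P, U, W, a, b, c):
    return (-0.5 * np.dot(np.dot(P, h), cosine_sq(v, U))
            - np.dot(b, h)
            + 0.5 * np.dot(v, v)
            - np.dot(v, np.dot(W, g))
            - np.dot(c, g)
            - np.dot(a, v))

def free_energy_closed_form(v, P, U, W, a, b, c):
    h_in = b + 0.5 * np.dot(cosine_sq(v, U), P)   # shape (K,)
    g_in = c + np.dot(v, W)                       # shape (J,)
    # logaddexp(0, x) computes log(1 + exp(x)) in a numerically stable way.
    return (-np.sum(np.logaddexp(0.0, h_in))
            - np.sum(np.logaddexp(0.0, g_in))
            + 0.5 * np.dot(v, v)
            - np.dot(a, v))

def free_energy_brute_force(v, P, U, W, a, b, c):
    # -log of the sum of exp(-E) over all 2**K * 2**J hidden configurations.
    total = 0.0
    for h in itertools.product([0.0, 1.0], repeat=b.shape[0]):
        for g in itertools.product([0.0, 1.0], repeat=c.shape[0]):
            total += np.exp(-energy(v, np.array(h), np.array(g),
                                    P, U, W, a, b, c))
    return -np.log(total)

rng = np.random.RandomState(3)
I, F, J, K = 5, 4, 2, 3                # tiny sizes keep the enumeration cheap
v = rng.randn(I)
P = -np.abs(rng.randn(F, K))
U = rng.randn(I, F)
W = rng.randn(I, J)
a = rng.randn(I)
b = rng.randn(K)
c = rng.randn(J)

closed = free_energy_closed_form(v, P, U, W, a, b, c)
brute = free_energy_brute_force(v, P, U, W, a, b, c)
assert np.allclose(closed, brute)
```

The enumeration costs 2**(K + J) energy evaluations, so it is only useful as a test on very
small models; the closed form is what makes the free energy usable in practice.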

import sys
import logging
import numpy as np
import numpy
from theano import function, shared, dot
from theano import tensor as TT
import theano.sparse  # installs the sparse shared var handler
floatX = theano.config.floatX

from pylearn.sampling.hmc import HMC_sampler
from pylearn.io import image_tiling

from sparse_coding import numpy_project_onto_ball
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
207 |
973
aa201f357d7b
mcRBM - added numpy import
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
972
diff
changeset
|
208 print >> sys.stderr, "mcRBM IS NOT READY YET" |
aa201f357d7b
mcRBM - added numpy import
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
972
diff
changeset
|
209 |
aa201f357d7b
mcRBM - added numpy import
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
972
diff
changeset
|
210 |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
211 #TODO: This should be in the nnet part of the library |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
212 def sgd_updates(params, grads, lr): |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
213 try: |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
214 float(lr) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
215 lr = [lr for p in params] |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
216 except TypeError: |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
217 pass |
974
f2cdcc71ece1
mcRBM - added L1 penalties and normal sign convention to contrastive grad
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
973
diff
changeset
|
218 updates = [(p, p - plr * gp) for (plr, p, gp) in zip(lr, params, grads)] |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
219 return updates |
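The update rule in `sgd_updates` can be illustrated without Theano. The following is a minimal NumPy sketch, not part of the module: `sgd_step` is a hypothetical name, and it assumes `lr` is either a Python scalar or a sequence with one learning rate per parameter.

```python
import numpy as np

def sgd_step(params, grads, lr):
    """Return updated copies of params after one gradient-descent step."""
    if np.isscalar(lr):
        # broadcast a single learning rate to every parameter
        lr = [lr] * len(params)
    return [p - plr * g for plr, p, g in zip(lr, params, grads)]

params = [np.array([1.0, 2.0]), np.array([3.0])]
grads = [np.array([0.5, -0.5]), np.array([1.0])]

same_lr = sgd_step(params, grads, 0.1)            # one rate for all parameters
per_param = sgd_step(params, grads, [0.1, 0.01])  # one rate per parameter
```

Each parameter moves against its gradient, scaled by its own learning rate.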
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
220 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
221 def as_shared(x, name=None, dtype=floatX): |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
222 if hasattr(x, 'type'): |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
223 return x |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
224 else: |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
225 if 'float' in str(x.dtype): |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
226 return shared(x.astype(floatX), name=name) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
227 else: |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
228 return shared(x, name=name) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
229 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
230 def hidden_cov_units_preactivation_given_v(rbm, v, small=1e-8): |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
231 (U,W,a,b,c) = rbm |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
232 unit_v = v / (TT.sqrt(TT.sum(v**2, axis=1))+small).dimshuffle(0,'x') # unit rows |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
233 unit_U = U # assuming unit cols! |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
234 #unit_U = U / (TT.sqrt(TT.sum(U**2, axis=0))+small) #unit cols |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
235 return b - 0.5 * dot(unit_v, unit_U)**2 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
236 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
237 def free_energy_given_v(rbm, v): |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
238 """Returns theano expression for free energy of visible vector `v` in an mcRBM |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
239 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
240 An mcRBM is parametrized |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
241 by `U`, `W`, `a`, `b`, `c`. |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
242 See module-level documentation for explanations of the `U`, `W`, `a`, `b` and `c` parameters. |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
243 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
244 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
245 The free energy of v is what we need for learning and hybrid Monte Carlo negative-phase |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
246 sampling. |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
247 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
248 """ |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
249 U, W, a, b, c = rbm |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
250 |
975
38e66e0da66a
mcRBM - put softplus in directly for num. stability
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
974
diff
changeset
|
251 t0 = -TT.sum(TT.nnet.softplus(hidden_cov_units_preactivation_given_v(rbm, v)),axis=1) |
38e66e0da66a
mcRBM - put softplus in directly for num. stability
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
974
diff
changeset
|
252 t1 = -TT.sum(TT.nnet.softplus(c + dot(v,W)), axis=1) |
38e66e0da66a
mcRBM - put softplus in directly for num. stability
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
974
diff
changeset
|
253 t2 = 0.5 * TT.sum(v**2, axis=1) |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
254 t3 = -TT.dot(v, a) |
976
4cbd65cf902d
mcRBM - added extra free_energy param
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
975
diff
changeset
|
255 return t0 + t1 + t2 + t3, (t0, t1, t2, t3) |
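As a cross-check on the four terms `t0` to `t3`, the same free energy can be written in plain NumPy. This is a minimal sketch, not part of the module: `softplus` and `free_energy_numpy` are hypothetical names, and it assumes unit-norm columns of `U` and a negative-one-diagonal `P`, mirroring what the Theano code does.

```python
import numpy as np

def softplus(x):
    # log(1 + exp(x)), computed in a numerically stable way
    return np.logaddexp(0.0, x)

def free_energy_numpy(U, W, a, b, c, v, small=1e-8):
    """Free energy of each row of v (assumes unit-norm columns of U)."""
    norms = np.sqrt((v ** 2).sum(axis=1)) + small
    unit_v = v / norms[:, None]                              # unit rows
    t0 = -softplus(b - 0.5 * unit_v.dot(U) ** 2).sum(axis=1)  # covariance units
    t1 = -softplus(c + v.dot(W)).sum(axis=1)                  # mean units
    t2 = 0.5 * (v ** 2).sum(axis=1)                           # quadratic term
    t3 = -v.dot(a)                                            # visible bias
    return t0 + t1 + t2 + t3

rng = np.random.RandomState(0)
n_I, n_K, n_J = 6, 4, 3
U = rng.randn(n_I, n_K)
U /= np.sqrt((U ** 2).sum(axis=0))   # normalize columns of U
W = rng.randn(n_I, n_J)
a, b, c = rng.randn(n_I), rng.randn(n_K), rng.randn(n_J)
v = rng.randn(5, n_I)

fe = free_energy_numpy(U, W, a, b, c, v)   # one value per row of v
```

A small dense version like this can be handy for comparing against the compiled Theano expression on a few test vectors.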
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
256 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
257 def expected_h_g_given_v(P, U, W, b, c, v): |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
258 """Returns theano expressions for the conditional expectations (`h`, `g`) in an mcRBM. |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
259 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
260 An mcRBM is parametrized |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
261 by `U`, `W`, `b`, `c`. |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
262 See module-level documentation for explanations of the `U`, `W`, `b` and `c` parameters. |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
263 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
264 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
265 The conditional E[h, g | v] is what we need to classify images. |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
266 """ |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
267 raise NotImplementedError() |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
268 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
269 #TODO: check to see if these args should be negated? |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
270 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
271 if P is None: |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
272 h = nnet.sigmoid(b + 0.5 * cosines(v,U)) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
273 else: |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
274 h = nnet.sigmoid(b + 0.5 * dot(cosines(v,U), P)) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
275 g = nnet.sigmoid(c + dot(v,W)) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
276 return (h, g) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
277 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
278 class MeanCovRBM(object): |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
279 """Container for mcRBM parameters that gives more convenient access to mcRBM methods. |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
280 """ |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
281 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
282 params = property(lambda s: [s.U, s.W, s.a, s.b, s.c]) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
283 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
284 n_visible = property(lambda s: s.W.value.shape[0]) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
285 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
286 def __init__(self, U, W, a, b, c): |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
287 self.U = as_shared(U, 'U') |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
288 self.W = as_shared(W, 'W') |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
289 self.a = as_shared(a, 'a') |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
290 self.b = as_shared(b, 'b') |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
291 self.c = as_shared(c, 'c') |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
292 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
293 assert self.b.type.dtype == 'float32' |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
294 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
295 @classmethod |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
296 def new_from_dims(cls, |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
297 n_I, # input dimensionality |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
298 n_K, # number of covariance hidden units |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
299 n_F, # number of covariance filters (squared) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
300 n_J, # number of mean filters (linear) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
301 seed = 8923402190, |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
302 ): |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
303 """ |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
304 Return a MeanCovRBM instance with randomly-initialized parameters. |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
305 """ |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
306 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
307 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
308 if 0: |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
309 if P_init == 'diag': |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
310 if n_K != n_F: |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
311 raise ValueError('cannot use diagonal initialization of non-square P matrix') |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
312 import scipy.sparse |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
313 P = -scipy.sparse.identity(n_K).tocsr() |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
314 else: |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
315 raise NotImplementedError() |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
316 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
317 rng = np.random.RandomState(seed) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
318 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
319 # initialization taken from Marc'Aurelio |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
320 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
321 return cls( |
977
9cac1ecaeef7
mcRBM - changed init of U to match M'A.R's code
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
976
diff
changeset
|
322 #U = numpy_project_onto_ball(rng.randn(n_I, n_F).T).T, |
9cac1ecaeef7
mcRBM - changed init of U to match M'A.R's code
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
976
diff
changeset
|
323 U = 0.2 * rng.randn(n_I, n_F), |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
324 W = rng.randn(n_I, n_J)/np.sqrt((n_I+n_J)/2), |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
325 a = np.ones(n_I)*(-2), |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
326 b = np.ones(n_K)*2, |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
327 c = np.zeros(n_J),) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
328 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
329 def __getstate__(self): |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
330 # unpack shared containers, which may have references to Theano stuff |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
331 # and are not a long-term stable data type. |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
332 return dict( |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
333 U = self.U.value, |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
334 W = self.W.value, a = self.a.value, |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
335 b = self.b.value, |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
336 c = self.c.value) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
337 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
338 def __setstate__(self, dct): |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
339 self.__init__(**dct) # calls as_shared on pickled arrays |
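For the `__setstate__` call to `__init__(**dct)` to succeed, the dict returned by `__getstate__` needs a key for every required `__init__` argument. The following is a minimal sketch of that round trip using plain NumPy arrays; `ParamBox` is a hypothetical stand-in for `MeanCovRBM`, with no Theano shared variables involved.

```python
import pickle
import numpy as np

class ParamBox(object):
    """Hypothetical stand-in for MeanCovRBM that stores plain arrays."""
    names = ('U', 'W', 'a', 'b', 'c')

    def __init__(self, U, W, a, b, c):
        self.U, self.W, self.a, self.b, self.c = U, W, a, b, c

    def __getstate__(self):
        # one entry per __init__ argument, so __setstate__ can rebuild
        return dict((n, getattr(self, n)) for n in self.names)

    def __setstate__(self, dct):
        self.__init__(**dct)

box = ParamBox(np.eye(2), np.ones((2, 3)), np.zeros(2), np.ones(2), np.zeros(3))
clone = pickle.loads(pickle.dumps(box))   # round trip through pickle
```

Storing raw arrays rather than shared containers keeps the pickled file independent of Theano internals, which is the motivation given in the comments of `__getstate__`.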
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
340 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
341 def hmc_sampler(self, n_particles=100, seed=7823748): |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
342 return HMC_sampler( |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
343 positions = [as_shared( |
978
ab4bc97ca060
mcRBM - particles initialized w randn instead of rand()
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
977
diff
changeset
|
344 np.random.RandomState(seed^20893).randn( |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
345 n_particles, |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
346 self.n_visible ))], |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
347 energy_fn = lambda p : self.free_energy_given_v(p[0]), |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
348 seed=seed) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
349 |
976
4cbd65cf902d
mcRBM - added extra free_energy param
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
975
diff
changeset
|
350 def free_energy_given_v(self, v, extra=False): |
4cbd65cf902d
mcRBM - added extra free_energy param
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
975
diff
changeset
|
351 rval = free_energy_given_v(self.params, v) |
4cbd65cf902d
mcRBM - added extra free_energy param
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
975
diff
changeset
|
352 if extra: |
4cbd65cf902d
mcRBM - added extra free_energy param
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
975
diff
changeset
|
353 return rval |
4cbd65cf902d
mcRBM - added extra free_energy param
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
975
diff
changeset
|
354 else: |
4cbd65cf902d
mcRBM - added extra free_energy param
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
975
diff
changeset
|
355 return rval[0] |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
356 |
974
f2cdcc71ece1
mcRBM - added L1 penalties and normal sign convention to contrastive grad
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
973
diff
changeset
|
357 def contrastive_gradient(self, pos_v, neg_v, U_l1_penalty=0, W_l1_penalty=0): |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
358 """Return a list of gradient expressions for self.params |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
359 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
360 :param pos_v: positive-phase sample of visible units |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
361 :param neg_v: negative-phase sample of visible units |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
362 """ |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
363 pos_FE = self.free_energy_given_v(pos_v) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
364 neg_FE = self.free_energy_given_v(neg_v) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
365 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
366 gpos_FE = theano.tensor.grad(pos_FE.sum(), self.params) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
367 gneg_FE = theano.tensor.grad(neg_FE.sum(), self.params) |
974
f2cdcc71ece1
mcRBM - added L1 penalties and normal sign convention to contrastive grad
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
973
diff
changeset
|
368 rval = [ gp - gn for (gp,gn) in zip(gpos_FE, gneg_FE)] |
f2cdcc71ece1
mcRBM - added L1 penalties and normal sign convention to contrastive grad
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
973
diff
changeset
|
369 rval[0] = rval[0] + TT.sign(self.U)*U_l1_penalty |
f2cdcc71ece1
mcRBM - added L1 penalties and normal sign convention to contrastive grad
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
973
diff
changeset
|
370 rval[1] = rval[1] + TT.sign(self.W)*W_l1_penalty |
f2cdcc71ece1
mcRBM - added L1 penalties and normal sign convention to contrastive grad
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
973
diff
changeset
|
371 return rval |
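The positive-minus-negative structure of the contrastive gradient is easiest to see for the visible bias `a`, where the free energy is linear in the parameter. The following is a minimal NumPy sketch under that simplification: `bias_free_energy`, `contrastive_grad_a` and `objective` are hypothetical names, and only the two free-energy terms that involve `a` or `v` alone are kept.

```python
import numpy as np

def bias_free_energy(a, v):
    """Terms of the free energy kept in this sketch: 0.5*sum(v^2) - v.a"""
    return 0.5 * (v ** 2).sum(axis=1) - v.dot(a)

def contrastive_grad_a(pos_v, neg_v):
    """Gradient w.r.t. a of sum(FE(pos_v)) - sum(FE(neg_v))."""
    return neg_v.sum(axis=0) - pos_v.sum(axis=0)

rng = np.random.RandomState(0)
pos_v = rng.randn(4, 3)   # stand-in for positive-phase (data) samples
neg_v = rng.randn(4, 3)   # stand-in for negative-phase (sampler) particles
a = rng.randn(3)

analytic = contrastive_grad_a(pos_v, neg_v)

def objective(a_):
    return bias_free_energy(a_, pos_v).sum() - bias_free_energy(a_, neg_v).sum()

# central finite differences as an independent check
eps = 1e-6
numeric = np.array([
    (objective(a + eps * e) - objective(a - eps * e)) / (2 * eps)
    for e in np.eye(3)])
```

Descending this gradient lowers the free energy of the data relative to that of the negative particles, which is the convention `sgd_updates` expects.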
f2cdcc71ece1
mcRBM - added L1 penalties and normal sign convention to contrastive grad
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
973
diff
changeset
|
372 |
f2cdcc71ece1
mcRBM - added L1 penalties and normal sign convention to contrastive grad
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
973
diff
changeset
|
373 from pylearn.dataset_ops.protocol import TensorFnDataset |
f2cdcc71ece1
mcRBM - added L1 penalties and normal sign convention to contrastive grad
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
973
diff
changeset
|
374 from pylearn.dataset_ops.memo import memo |
f2cdcc71ece1
mcRBM - added L1 penalties and normal sign convention to contrastive grad
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
973
diff
changeset
|
375 import scipy.io |
f2cdcc71ece1
mcRBM - added L1 penalties and normal sign convention to contrastive grad
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
973
diff
changeset
|
376 @memo |
f2cdcc71ece1
mcRBM - added L1 penalties and normal sign convention to contrastive grad
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
973
diff
changeset
|
377 def load_mcRBM_demo_patches(): |
f2cdcc71ece1
mcRBM - added L1 penalties and normal sign convention to contrastive grad
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
973
diff
changeset
|
378 d = scipy.io.loadmat('/u/bergstrj/cvs/articles/2010/spike_slab_RBM/src/marcaurelio/training_colorpatches_16x16_demo.mat') |
f2cdcc71ece1
mcRBM - added L1 penalties and normal sign convention to contrastive grad
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
973
diff
changeset
|
379 totnumcases = d["whitendata"].shape[0] |
f2cdcc71ece1
mcRBM - added L1 penalties and normal sign convention to contrastive grad
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
973
diff
changeset
|
380 #d = d["whitendata"][0:np.floor(totnumcases/batch_size)*batch_size,:].copy() |
f2cdcc71ece1
mcRBM - added L1 penalties and normal sign convention to contrastive grad
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
973
diff
changeset
|
381 d = d["whitendata"].copy() |
f2cdcc71ece1
mcRBM - added L1 penalties and normal sign convention to contrastive grad
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
973
diff
changeset
|
382 return d |
f2cdcc71ece1
mcRBM - added L1 penalties and normal sign convention to contrastive grad
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
973
diff
changeset
|
383 |
967
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
384 |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
385 if __name__ == '__main__': |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
386 |
387 print >> sys.stderr, "TODO: use P matrix (aka FH matrix)" |
388 |
979
2a53384d9742
mcRBM - hacks to driver
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
978
diff
changeset
|
389 dataset='MAR' |
390 if dataset == 'MAR': |
391 R,C= 21,5 |
392 n_patches=10240 |
393 demodata = scipy.io.loadmat('/u/bergstrj/cvs/articles/2010/spike_slab_RBM/src/marcaurelio/training_colorpatches_16x16_demo.mat') |
394 else: |
395 R,C= 16,16 # the size of image patches |
396 n_patches=100000 |
397 |
398 n_train_iters=30000 |
399 |
400 n_burnin_steps=10000 |
401 |
967
402 l1_penalty=1e-3 |
403 no_l1_epochs = 10 |
979
404 effective_l1_penalty=0.0 |
967
405 |
406 epoch_size=50000 |
407 batchsize = 128 |
408 lr = 0.075 / batchsize |
409 s_lr = TT.scalar() |
979
410 s_l1_penalty=TT.scalar() |
967
411 n_K=256 |
412 n_F=256 |
413 n_J=100 |
414 |
415 rbm = MeanCovRBM.new_from_dims(n_I=R*C, |
416 n_K=n_K, |
417 n_J=n_J, |
418 n_F=n_F, |
419 ) |
420 |
979
421 sampler = rbm.hmc_sampler(n_particles=batchsize) |
967
422 |
979
423 def l2(X): |
424 return (X**2).sum() |
425 def tile(X, fname): |
426 if dataset == 'MAR': |
427 X = np.dot(X, demodata['invpcatransf'].T) |
428 R=16 |
429 C=16 |
430 #X = X.reshape((X.shape[0], 3, 16, 16)).transpose([0,2,3,1]).copy() |
431 X = (X[:,:256], X[:,256:512], X[:,512:], None) |
432 _img = image_tiling.tile_raster_images(X, |
433 img_shape=(R,C), |
434 min_dynamic_range=1e-2) |
435 image_tiling.save_tiled_raster_images(_img, fname) |
436 #print "Burning in..." |
437 #for burnin in xrange(n_burnin_steps): |
438 #sampler.simulate() |
439 |
440 if 0: |
441 print "Just SAMPLING..." |
442 for jj in xrange(n_burnin_steps): |
443 if 0 == jj % 100: |
444 tile(sampler.positions[0].value, "sampler_%06i.png"%jj) |
445 tile(numpy.random.randn(100, 105), "random_%06i.png"%jj) |
446 print "burning in... ", jj |
447 sys.stdout.flush() |
448 sampler.simulate() |
449 |
450 sys.exit() |
967
451 |
452 batch_idx = TT.iscalar() |
453 |
979
454 if 0: |
455 from pylearn.dataset_ops import image_patches |
456 train_batch = image_patches.image_patches( |
457 s_idx = (batch_idx * batchsize + np.arange(batchsize)), |
458 dims = (n_patches,R,C), |
459 center=True, |
460 unitvar=True, |
461 dtype=floatX, |
462 rasterized=True) |
463 else: |
464 op = TensorFnDataset(floatX, |
465 bcast=(False,), |
466 fn=load_mcRBM_demo_patches, |
467 single_shape=(105,)) |
468 train_batch = op((batch_idx * batchsize + np.arange(batchsize))%n_patches) |
967
469 |
979
470 imgs_fn = function([batch_idx], outputs=train_batch) |
471 |
472 grads = rbm.contrastive_gradient( |
473 pos_v=train_batch, |
474 neg_v=sampler.positions[0], |
475 U_l1_penalty=s_l1_penalty, |
476 W_l1_penalty=s_l1_penalty) |
477 |
478 learn_fn = function([batch_idx, s_lr, s_l1_penalty], |
967
479 outputs=[ |
480 grads[0].norm(2), |
979
481 rbm.free_energy_given_v(train_batch).sum(), |
482 rbm.free_energy_given_v(train_batch,extra=1)[1][0].sum(), |
483 rbm.free_energy_given_v(train_batch,extra=1)[1][1].sum(), |
484 rbm.free_energy_given_v(train_batch,extra=1)[1][2].sum(), |
485 rbm.free_energy_given_v(train_batch,extra=1)[1][3].sum(), |
967
486 ], |
487 updates = sgd_updates( |
488 rbm.params, |
489 grads, |
490 lr=[2*s_lr, .2*s_lr, .02*s_lr, .1*s_lr, .02*s_lr ])) |
979
491 theano.printing.pydotprint(learn_fn, 'learn_fn.png') |
967
492 |
979
493 print "Learning..." |
494 normVF=1 |
495 for jj in xrange(n_train_iters): |
496 |
497 print_jj = ((1 and jj < 100) |
498 or (0 and jj < 100 and 0==jj%10) |
499 or (jj < 1000 and 0==jj%100) |
500 or (1 and jj < 10000 and 0==jj%1000)) |
501 |
967
502 |
979
503 if print_jj: |
504 tile(imgs_fn(jj), "imgs_%06i.png"%jj) |
505 tile(sampler.positions[0].value, "sample_%06i.png"%jj) |
506 tile(rbm.U.value.T, "U_%06i.png"%jj) |
507 tile(rbm.W.value.T, "W_%06i.png"%jj) |
967
508 |
979
509 print 'saving samples', jj, 'epoch', jj/(epoch_size/batchsize), |
510 print 'l2(U)', l2(rbm.U.value), |
511 print 'l2(W)', l2(rbm.W.value), |
512 print 'U min max', rbm.U.value.min(), rbm.U.value.max(), |
513 print 'W min max', rbm.W.value.min(), rbm.W.value.max(), |
514 print 'a min max', rbm.a.value.min(), rbm.a.value.max(), |
515 print 'b min max', rbm.b.value.min(), rbm.b.value.max(), |
516 print 'c min max', rbm.c.value.min(), rbm.c.value.max(), |
967
517 |
979
518 print 'parts min', sampler.positions[0].value.min(), |
519 print 'max',sampler.positions[0].value.max(), |
520 print 'HMC step', sampler.stepsize, |
521 print 'arate', sampler.avg_acceptance_rate |
522 |
523 sampler.simulate() |
524 |
525 l2_of_Ugrad = learn_fn(jj, |
526 lr/max(1, jj/(20*epoch_size/batchsize)), |
527 effective_l1_penalty) |
967
528 |
979
529 if print_jj: |
530 print 'l2(gU)', float(l2_of_Ugrad[0]), |
531 print 'FE+', float(l2_of_Ugrad[1]), |
532 print 'FE+[0]', float(l2_of_Ugrad[2]), |
533 print 'FE+[1]', float(l2_of_Ugrad[3]), |
534 print 'FE+[2]', float(l2_of_Ugrad[4]), |
535 print 'FE+[3]', float(l2_of_Ugrad[5]), |
536 |
537 if jj == no_l1_epochs * epoch_size/batchsize: |
538 print "Activating L1 weight decay" |
539 effective_l1_penalty = l1_penalty |
540 |
541 if 0: |
542 rbm.U.value = numpy_project_onto_ball(rbm.U.value.T).T |
543 else: |
544 # weird normalization technique... |
545 # It constrains all the columns of the matrix to have the same length |
546 # But the matrix itself is re-scaled to have an arbitrary absolute size. |
547 U = rbm.U.value |
548 U_norms = np.sqrt((U*U).sum(axis=0)) |
549 assert len(U_norms) == n_F |
550 normVF = .95 * normVF + .05 * np.mean(U_norms) |
551 rbm.U.value = rbm.U.value * normVF/U_norms |
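The column normalization in lines 544-551 can be read in isolation as the following NumPy sketch. It is an illustration under assumptions, not the listing's own helper: the function name `renormalize_columns` and the `decay` parameter are hypothetical, and the 0.95/0.05 weights are taken from line 550. A column of all zeros would cause a division by zero here, as it would in the listing.

```python
import numpy as np

def renormalize_columns(U, running_norm, decay=0.95):
    """Rescale every column of U to one shared, slowly varying length.

    The shared length is an exponential moving average of the mean column
    norm. `decay` and the function name are illustrative assumptions.
    """
    col_norms = np.sqrt((U * U).sum(axis=0))  # one L2 norm per column
    running_norm = decay * running_norm + (1.0 - decay) * col_norms.mean()
    # Broadcasting divides each column by its own norm, then scales all
    # columns to the shared running norm.
    return U * (running_norm / col_norms), running_norm


rng = np.random.default_rng(0)
U = rng.normal(size=(105, 256))  # shape chosen to match n_I=21*5, n_F=256
U_new, normVF = renormalize_columns(U, running_norm=1.0)
new_norms = np.sqrt((U_new * U_new).sum(axis=0))
```

After one call every column has the same length (`normVF`), while the absolute scale of the matrix drifts with the running average rather than being pinned to 1.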
967
552 |
553 |
554 # |
555 # |
556 # Marc'Aurelio Ranzato's code |
557 # |
558 ###################################################################### |
559 # compute the value of the free energy at a given input |
560 # F = - sum log(1+exp(- .5 FH (VF data/norm(data))^2 + bias_cov)) +... |
561 # - sum log(1+exp(w_mean data + bias_mean)) + ... |
562 # - bias_vis data + 0.5 data^2 |
563 # NOTE: FH is constrained to be positive |
564 # (in the paper the sign is negative but the sign in front of it is also flipped) |
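The free-energy formula in the comment above (lines 560-562) can be sketched in plain NumPy as follows. This is a hedged CPU illustration, not the cudamat routine: the name `free_energy` is hypothetical, the array layout (one sample per column) follows the shape comments in the listing, and the value passed for `small` is an arbitrary placeholder. `np.logaddexp(0, x)` is used as a numerically stable `log(1 + exp(x))`.

```python
import numpy as np

def free_energy(data, VF, FH, bias_cov, bias_vis, w_mean, bias_mean, small):
    """Free energy per sample; data is D x P with one sample per column.

    Assumed shapes: VF is D x F, FH is F x O, w_mean is D x M,
    bias_cov has length O, bias_mean length M, bias_vis length D.
    """
    num_vis = data.shape[0]
    lengthsq = (data * data).sum(axis=0)              # sum_i data_ij^2
    energy = 0.5 * lengthsq                           # quadratic term
    normcoeff = 1.0 / np.sqrt(lengthsq / num_vis + small)
    normdata = data * normcoeff                       # unit-RMS columns

    # covariance contribution: -sum log(1 + exp(-.5 FH^T (VF^T v)^2 + b))
    featsq = (VF.T @ normdata) ** 2
    cov_in = -0.5 * (FH.T @ featsq) + bias_cov[:, None]
    energy -= np.logaddexp(0.0, cov_in).sum(axis=0)

    # mean contribution: -sum log(1 + exp(w_mean^T v + b))
    mean_in = w_mean.T @ data + bias_mean[:, None]
    energy -= np.logaddexp(0.0, mean_in).sum(axis=0)

    # visible bias term
    energy -= (bias_vis[:, None] * data).sum(axis=0)
    return energy


# Sanity case: with all weights and biases zero, each hidden unit
# contributes -log(2) and only the quadratic term depends on the data.
D, P, F, O, M = 4, 3, 5, 2, 3
F_zero = free_energy(np.ones((D, P)), np.zeros((D, F)), np.zeros((F, O)),
                     np.zeros(O), np.zeros(D), np.zeros((D, M)), np.zeros(M),
                     small=0.5)
```

The sketch covers only the potential terms named in the comment; anything the routine below adds beyond them is not represented here.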
565 def compute_energy_mcRBM(data,normdata,vel,energy,VF,FH,bias_cov,bias_vis,w_mean,bias_mean,t1,t2,t6,feat,featsq,feat_mean,length,lengthsq,normcoeff,small,num_vis): |
566 # normalize input data vectors |
567 data.mult(data, target = t6) # DxP (nr input dims x nr samples) |
568 t6.sum(axis = 0, target = lengthsq) # 1xP |
569 lengthsq.mult(0.5, target = energy) # energy of quadratic regularization term |
570 lengthsq.mult(1./num_vis) # normalize by number of components (like std) |
571 |
572 lengthsq.add(small) # small prevents division by 0 |
573 # energy_j = \sum_i 0.5 data_ij ^2 |
574 # lengthsq_j = (1/num_vis) \sum_i data_ij ^2 + small |
575 cmt.sqrt(lengthsq, target = length) |
576 # length_j = sqrt(lengthsq_j) |
577 length.reciprocal(target = normcoeff) # 1xP |
578 # normcoeff_j = 1/sqrt(lengthsq_j) |
579 data.mult_by_row(normcoeff, target = normdata) # normalized data |
580 # normdata is like data, but each col is scaled to unit RMS (L2 norm about sqrt(num_vis)) |
581 |
582 ## potential |
583 # covariance contribution |
584 cmt.dot(VF.T, normdata, target = feat) # HxP (nr factors x nr samples) |
585 feat.mult(feat, target = featsq) # HxP |
586 |
587 # featsq holds the squared projections of normdata onto the cols of VF (squared cosines, up to scale) |
588 cmt.dot(FH.T,featsq, target = t1) # OxP (nr cov hiddens x nr samples) |
589 t1.mult(-0.5) |
590 t1.add_col_vec(bias_cov) # OxP |
591 cmt.exp(t1) # OxP |
592 t1.add(1, target = t2) # OxP |
593 cmt.log(t2) |
594 t2.mult(-1) |
595 energy.add_sums(t2, axis=0) |
596 # mean contribution |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
597 cmt.dot(w_mean.T, data, target = feat_mean) # HxP (nr mean hiddens x nr samples) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
598 feat_mean.add_col_vec(bias_mean) # HxP |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
599 cmt.exp(feat_mean) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
600 feat_mean.add(1) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
601 cmt.log(feat_mean) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
602 feat_mean.mult(-1) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
603 energy.add_sums(feat_mean, axis=0) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
604 # visible bias term |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
605 data.mult_by_col(bias_vis, target = t6) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
606 t6.mult(-1) # DxP |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
607 energy.add_sums(t6, axis=0) # 1xP |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
608 # kinetic |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
609 vel.mult(vel, target = t6) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
610 energy.add_sums(t6, axis = 0, mult = .5) |
90e11d5d0a41
adding algorithms/mcRBM, but it is not done yet
James Bergstra <bergstrj@iro.umontreal.ca>
parents:
diff
changeset
|
611 |
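# Illustrative sketch, not part of the original file: a plain NumPy restatement of the
# four energy contributions accumulated in the lines above (covariance, mean, visible
# bias, kinetic). The function name `energy_terms_numpy` is hypothetical. It assumes the
# array shapes given in the comments of the listing, with biases stored as column vectors,
# and it leaves out anything added to `energy` before this excerpt. np.logaddexp(0, x) is
# swapped in for the literal log(1 + exp(x)) because it is numerically safer for large x.

```python
import numpy as np

def energy_terms_numpy(data, normdata, vel, VF, FH, bias_cov, bias_vis, w_mean, bias_mean):
    """Return a length-P vector: the sum of the four per-sample energy terms."""
    # covariance contribution: -sum_o log(1 + exp(-0.5 * FH^T (VF^T normdata)^2 + bias_cov))
    featsq = np.dot(VF.T, normdata) ** 2                      # HxP
    t1 = -0.5 * np.dot(FH.T, featsq) + bias_cov               # OxP
    cov = -np.logaddexp(0.0, t1).sum(axis=0)
    # mean contribution: -sum_j log(1 + exp(w_mean^T data + bias_mean))
    mean = -np.logaddexp(0.0, np.dot(w_mean.T, data) + bias_mean).sum(axis=0)
    # visible bias term: -bias_vis^T data
    vis = -(data * bias_vis).sum(axis=0)
    # kinetic term: 0.5 * ||vel||^2 (used only by the HMC sampler)
    kin = 0.5 * (vel ** 2).sum(axis=0)
    return cov + mean + vis + kin
```

# Having the four terms in one expression may make it easier to check the in-place
# cudamat sequence against the model's energy function on small host-side arrays.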
######################################################
# mcRBM trainer: sweeps over the training set.
# For each batch of samples, compute the derivatives used to update the parameters
# at the training samples and at the negative samples drawn by calling the HMC sampler.
def train_mcRBM():

    config = ConfigParser()
    config.read('input_configuration')

    verbose = config.getint('VERBOSITY','verbose')

    num_epochs = config.getint('MAIN_PARAMETER_SETTING','num_epochs')
    batch_size = config.getint('MAIN_PARAMETER_SETTING','batch_size')
    startFH = config.getint('MAIN_PARAMETER_SETTING','startFH')
    startwd = config.getint('MAIN_PARAMETER_SETTING','startwd')
    doPCD = config.getint('MAIN_PARAMETER_SETTING','doPCD')

    # model parameters
    num_fac = config.getint('MODEL_PARAMETER_SETTING','num_fac')
    num_hid_cov = config.getint('MODEL_PARAMETER_SETTING','num_hid_cov')
    num_hid_mean = config.getint('MODEL_PARAMETER_SETTING','num_hid_mean')
    apply_mask = config.getint('MODEL_PARAMETER_SETTING','apply_mask')

    # load data
    data_file_name = config.get('DATA','data_file_name')
    d = loadmat(data_file_name) # input in the format PxD (P vectorized samples with D dimensions)
    totnumcases = d["whitendata"].shape[0]
    # keep only whole batches; integer division keeps the slice bound an int
    d = d["whitendata"][0:(totnumcases//batch_size)*batch_size,:].copy()
    totnumcases = d.shape[0]
    num_vis = d.shape[1]
    num_batches = totnumcases//batch_size
    dev_dat = cmt.CUDAMatrix(d.T) # VxP

    # training parameters
    epsilon = config.getfloat('OPTIMIZER_PARAMETERS','epsilon')
    epsilonVF = 2*epsilon
    epsilonFH = 0.02*epsilon
    epsilonb = 0.02*epsilon
    epsilonw_mean = 0.2*epsilon
    epsilonb_mean = 0.1*epsilon
    weightcost_final = config.getfloat('OPTIMIZER_PARAMETERS','weightcost_final')

    # HMC setting
    hmc_step_nr = config.getint('HMC_PARAMETERS','hmc_step_nr')
    hmc_step = 0.01
    hmc_target_ave_rej = config.getfloat('HMC_PARAMETERS','hmc_target_ave_rej')
    hmc_ave_rej = hmc_target_ave_rej

    # initialize weights
    VF = cmt.CUDAMatrix(np.array(0.02 * np.random.randn(num_vis, num_fac), dtype=np.float32, order='F')) # VxH
    if apply_mask == 0:
        FH = cmt.CUDAMatrix( np.array( np.eye(num_fac,num_hid_cov), dtype=np.float32, order='F') ) # HxO
    else:
        dd = loadmat('your_FHinit_mask_file.mat') # see CVPR2010paper_material/topo2D_3x3_stride2_576filt.mat for an example
        FH = cmt.CUDAMatrix( np.array( dd["FH"], dtype=np.float32, order='F') )
    bias_cov = cmt.CUDAMatrix( np.array(2.0*np.ones((num_hid_cov, 1)), dtype=np.float32, order='F') )
    bias_vis = cmt.CUDAMatrix( np.array(np.zeros((num_vis, 1)), dtype=np.float32, order='F') )
    w_mean = cmt.CUDAMatrix( np.array( 0.05 * np.random.randn(num_vis, num_hid_mean), dtype=np.float32, order='F') ) # VxH
    bias_mean = cmt.CUDAMatrix( np.array( -2.0*np.ones((num_hid_mean,1)), dtype=np.float32, order='F') )

    # initialize variables to store derivatives
    VFinc = cmt.CUDAMatrix( np.array(np.zeros((num_vis, num_fac)), dtype=np.float32, order='F'))
    FHinc = cmt.CUDAMatrix( np.array(np.zeros((num_fac, num_hid_cov)), dtype=np.float32, order='F'))
    bias_covinc = cmt.CUDAMatrix( np.array(np.zeros((num_hid_cov, 1)), dtype=np.float32, order='F'))
    bias_visinc = cmt.CUDAMatrix( np.array(np.zeros((num_vis, 1)), dtype=np.float32, order='F'))
    w_meaninc = cmt.CUDAMatrix( np.array(np.zeros((num_vis, num_hid_mean)), dtype=np.float32, order='F'))
    bias_meaninc = cmt.CUDAMatrix( np.array(np.zeros((num_hid_mean, 1)), dtype=np.float32, order='F'))

    # initialize temporary storage
    data = cmt.CUDAMatrix( np.array(np.empty((num_vis, batch_size)), dtype=np.float32, order='F')) # VxP
    normdata = cmt.CUDAMatrix( np.array(np.empty((num_vis, batch_size)), dtype=np.float32, order='F')) # VxP
    negdataini = cmt.CUDAMatrix( np.array(np.empty((num_vis, batch_size)), dtype=np.float32, order='F')) # VxP
    feat = cmt.CUDAMatrix( np.array(np.empty((num_fac, batch_size)), dtype=np.float32, order='F'))
    featsq = cmt.CUDAMatrix( np.array(np.empty((num_fac, batch_size)), dtype=np.float32, order='F'))
    negdata = cmt.CUDAMatrix( np.array(np.random.randn(num_vis, batch_size), dtype=np.float32, order='F'))
    old_energy = cmt.CUDAMatrix( np.array(np.zeros((1, batch_size)), dtype=np.float32, order='F'))
    new_energy = cmt.CUDAMatrix( np.array(np.zeros((1, batch_size)), dtype=np.float32, order='F'))
    gradient = cmt.CUDAMatrix( np.array(np.empty((num_vis, batch_size)), dtype=np.float32, order='F')) # VxP
    normgradient = cmt.CUDAMatrix( np.array(np.empty((num_vis, batch_size)), dtype=np.float32, order='F')) # VxP
    thresh = cmt.CUDAMatrix( np.array(np.zeros((1, batch_size)), dtype=np.float32, order='F'))
    feat_mean = cmt.CUDAMatrix( np.array(np.empty((num_hid_mean, batch_size)), dtype=np.float32, order='F'))
    vel = cmt.CUDAMatrix( np.array(np.random.randn(num_vis, batch_size), dtype=np.float32, order='F'))
    length = cmt.CUDAMatrix( np.array(np.zeros((1, batch_size)), dtype=np.float32, order='F')) # 1xP
    lengthsq = cmt.CUDAMatrix( np.array(np.zeros((1, batch_size)), dtype=np.float32, order='F')) # 1xP
    normcoeff = cmt.CUDAMatrix( np.array(np.zeros((1, batch_size)), dtype=np.float32, order='F')) # 1xP
    if apply_mask==1: # used to constrain very large FH matrices, allowing values to change only within a neighborhood
        dd = loadmat('your_FHinit_mask_file.mat')
        mask = cmt.CUDAMatrix( np.array(dd["mask"], dtype=np.float32, order='F'))
    normVF = 1
    small = 0.5

    # other temporary vars
    t1 = cmt.CUDAMatrix( np.array(np.empty((num_hid_cov, batch_size)), dtype=np.float32, order='F'))
    t2 = cmt.CUDAMatrix( np.array(np.empty((num_hid_cov, batch_size)), dtype=np.float32, order='F'))
    t3 = cmt.CUDAMatrix( np.array(np.empty((num_fac, batch_size)), dtype=np.float32, order='F'))
    t4 = cmt.CUDAMatrix( np.array(np.empty((1,batch_size)), dtype=np.float32, order='F'))
    t5 = cmt.CUDAMatrix( np.array(np.empty((1,1)), dtype=np.float32, order='F'))
    t6 = cmt.CUDAMatrix( np.array(np.empty((num_vis, batch_size)), dtype=np.float32, order='F'))
    t7 = cmt.CUDAMatrix( np.array(np.empty((num_vis, batch_size)), dtype=np.float32, order='F'))
    t8 = cmt.CUDAMatrix( np.array(np.empty((num_vis, num_fac)), dtype=np.float32, order='F'))
    t9 = cmt.CUDAMatrix( np.array(np.zeros((num_fac, num_hid_cov)), dtype=np.float32, order='F'))
    t10 = cmt.CUDAMatrix( np.array(np.empty((1,num_fac)), dtype=np.float32, order='F'))
    t11 = cmt.CUDAMatrix( np.array(np.empty((1,num_hid_cov)), dtype=np.float32, order='F'))

    # start training
    for epoch in range(num_epochs):

        print("Epoch " + str(epoch + 1))

        # anneal learning rates (integer division: the divisor steps up every 20 epochs)
        epsilonVFc = epsilonVF/max(1,epoch//20)
        epsilonFHc = epsilonFH/max(1,epoch//20)
        epsilonbc = epsilonb/max(1,epoch//20)
        epsilonw_meanc = epsilonw_mean/max(1,epoch//20)
        epsilonb_meanc = epsilonb_mean/max(1,epoch//20)
        weightcost = weightcost_final

        if epoch <= startFH:
            epsilonFHc = 0
        if epoch <= startwd:
            weightcost = 0

        for batch in range(num_batches):

            # get current minibatch
            data = dev_dat.slice(batch*batch_size,(batch + 1)*batch_size) # DxP (nr dims x nr samples)

            # normalize input data
            data.mult(data, target = t6) # DxP
            t6.sum(axis = 0, target = lengthsq) # 1xP
            lengthsq.mult(1./num_vis) # normalize by number of components (like std)
            lengthsq.add(small) # small avoids division by 0
            cmt.sqrt(lengthsq, target = length)
            length.reciprocal(target = normcoeff) # 1xP
            data.mult_by_row(normcoeff, target = normdata) # normalized data
            ## compute positive sample derivatives
            # covariance part
            cmt.dot(VF.T, normdata, target = feat) # HxP (nr facs x nr samples)
            feat.mult(feat, target = featsq) # HxP
            cmt.dot(FH.T,featsq, target = t1) # OxP (nr cov hiddens x nr samples)
            t1.mult(-0.5)
            t1.add_col_vec(bias_cov) # OxP
            t1.apply_sigmoid(target = t2) # OxP
            cmt.dot(featsq, t2.T, target = FHinc) # HxO
            cmt.dot(FH,t2, target = t3) # HxP
            t3.mult(feat)
            cmt.dot(normdata, t3.T, target = VFinc) # VxH
            t2.sum(axis = 1, target = bias_covinc)
            bias_covinc.mult(-1)
            # visible bias
            data.sum(axis = 1, target = bias_visinc)
            bias_visinc.mult(-1)
            # mean part
            cmt.dot(w_mean.T, data, target = feat_mean) # HxP (nr mean hiddens x nr samples)
            feat_mean.add_col_vec(bias_mean) # HxP
            feat_mean.apply_sigmoid() # HxP
            feat_mean.mult(-1)
            cmt.dot(data, feat_mean.T, target = w_meaninc)
            feat_mean.sum(axis = 1, target = bias_meaninc)

            # HMC sampling: draw an approximate sample from the model
            if doPCD == 0: # CD-1 (set negative data to current training samples)
                hmc_step, hmc_ave_rej = draw_HMC_samples(data,negdata,normdata,vel,gradient,normgradient,new_energy,old_energy,VF,FH,bias_cov,bias_vis,w_mean,bias_mean,hmc_step,hmc_step_nr,hmc_ave_rej,hmc_target_ave_rej,t1,t2,t3,t4,t5,t6,t7,thresh,feat,featsq,batch_size,feat_mean,length,lengthsq,normcoeff,small,num_vis)
            else: # PCD-1 (use previous negative data as starting point for chain)
                negdataini.assign(negdata)
                hmc_step, hmc_ave_rej = draw_HMC_samples(negdataini,negdata,normdata,vel,gradient,normgradient,new_energy,old_energy,VF,FH,bias_cov,bias_vis,w_mean,bias_mean,hmc_step,hmc_step_nr,hmc_ave_rej,hmc_target_ave_rej,t1,t2,t3,t4,t5,t6,t7,thresh,feat,featsq,batch_size,feat_mean,length,lengthsq,normcoeff,small,num_vis)

            # compute derivatives at the negative samples
            # normalize input data
            negdata.mult(negdata, target = t6) # DxP
            t6.sum(axis = 0, target = lengthsq) # 1xP
            lengthsq.mult(1./num_vis) # normalize by number of components (like std)
            lengthsq.add(small)
            cmt.sqrt(lengthsq, target = length)
            length.reciprocal(target = normcoeff) # 1xP
            negdata.mult_by_row(normcoeff, target = normdata) # normalized data
            # covariance part
            cmt.dot(VF.T, normdata, target = feat) # HxP
            feat.mult(feat, target = featsq) # HxP
            cmt.dot(FH.T,featsq, target = t1) # OxP
            t1.mult(-0.5)
            t1.add_col_vec(bias_cov) # OxP
            t1.apply_sigmoid(target = t2) # OxP
            FHinc.subtract_dot(featsq, t2.T) # HxO
            FHinc.mult(0.5)
            cmt.dot(FH,t2, target = t3) # HxP
            t3.mult(feat)
            VFinc.subtract_dot(normdata, t3.T) # VxH
            bias_covinc.add_sums(t2, axis = 1)
            # visible bias
            bias_visinc.add_sums(negdata, axis = 1)
            # mean part
            cmt.dot(w_mean.T, negdata, target = feat_mean) # HxP
            feat_mean.add_col_vec(bias_mean) # HxP
            feat_mean.apply_sigmoid() # HxP
            w_meaninc.add_dot(negdata, feat_mean.T)
            bias_meaninc.add_sums(feat_mean, axis = 1)

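The normalization and covariance-hidden computation in the negative-phase block above can be restated in plain NumPy. This is only a sketch under stated assumptions: `sigmoid` and `covariance_hidden_probs` are hypothetical helper names, `bias_cov` is taken as a 1-D array rather than a column matrix, and the CPU arrays stand in for the `cmt` GPU matrices. The shapes follow the comments in the listing (samples in columns).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def covariance_hidden_probs(x, VF, FH, bias_cov, small):
    """x: (num_vis, P); VF: (num_vis, num_fac); FH: (num_fac, num_hid_cov)."""
    num_vis = x.shape[0]
    # mean squared value per sample, plus a small constant (1xP)
    lengthsq = (x * x).sum(axis=0) / num_vis + small
    normcoeff = 1.0 / np.sqrt(lengthsq)
    normdata = x * normcoeff                 # scale each column (sample)
    feat = VF.T.dot(normdata)                # factor outputs, HxP
    featsq = feat * feat
    t1 = -0.5 * FH.T.dot(featsq) + bias_cov[:, None]   # OxP
    return normdata, sigmoid(t1)

# Usage with small random inputs (values are illustrative only).
rng = np.random.RandomState(0)
x = rng.randn(6, 4)
VF = rng.randn(6, 5)
FH = np.abs(rng.randn(5, 3))
bias_cov = np.zeros(3)
normdata, p = covariance_hidden_probs(x, VF, FH, bias_cov, small=0.5)
```

With a non-negative `FH` and a zero bias, the sigmoid input is never positive, so every probability is at most 0.5; the squared factor outputs can only switch covariance hiddens off.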
            # update parameters
            VFinc.add_mult(VF.sign(), weightcost) # L1 regularization
            VF.add_mult(VFinc, -epsilonVFc/batch_size)
            # normalize columns of VF: normalize by running average of their norm
            VF.mult(VF, target = t8)
            t8.sum(axis = 0, target = t10)
            cmt.sqrt(t10)
            t10.sum(axis=1,target = t5)
            t5.copy_to_host()
            normVF = .95*normVF + (.05/num_fac) * t5.numpy_array[0,0] # estimate norm
            t10.reciprocal()
            VF.mult_by_row(t10)
            VF.mult(normVF)
            bias_cov.add_mult(bias_covinc, -epsilonbc/batch_size)
            bias_vis.add_mult(bias_visinc, -epsilonbc/batch_size)

            if epoch > startFH:
                FHinc.add_mult(FH.sign(), weightcost) # L1 regularization
                FH.add_mult(FHinc, -epsilonFHc/batch_size) # update
                # set to 0 negative entries in FH
                FH.greater_than(0, target = t9)
                FH.mult(t9)
                if apply_mask==1:
                    FH.mult(mask)
                # normalize columns of FH: L1 norm set to 1 in each column
                FH.sum(axis = 0, target = t11)
                t11.reciprocal()
                FH.mult_by_row(t11)
            w_meaninc.add_mult(w_mean.sign(),weightcost)
            w_mean.add_mult(w_meaninc, -epsilonw_meanc/batch_size)
            bias_mean.add_mult(bias_meaninc, -epsilonb_meanc/batch_size)

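The two constraints applied after the gradient steps above (rescaling the columns of `VF` to a running-average norm, and keeping `FH` non-negative with unit L1 column norm) can be sketched in NumPy as follows. The function names `renormalize_VF` and `project_FH` and the `decay` parameter are hypothetical; the 0.95 default mirrors the constant in the listing, and the sketch assumes every column of `FH` keeps at least one positive entry, otherwise the division is undefined (as in the listing).

```python
import numpy as np

def renormalize_VF(VF, normVF, decay=0.95):
    """Rescale every column of VF (num_vis x num_fac) to a shared running-average norm."""
    col_norms = np.sqrt((VF * VF).sum(axis=0))      # one L2 norm per factor
    normVF = decay * normVF + (1.0 - decay) * col_norms.mean()
    return VF / col_norms * normVF, normVF

def project_FH(FH, mask=None):
    """Zero negative entries, optionally apply a mask, then set each column's L1 norm to 1."""
    FH = FH * (FH > 0)
    if mask is not None:
        FH = FH * mask
    return FH / FH.sum(axis=0)

# Usage (illustrative values only).
rng = np.random.RandomState(0)
VF, normVF = renormalize_VF(rng.randn(6, 5), normVF=1.0)
FH_raw = np.array([[ 0.5, -1.0,  2.0],
                   [-0.5,  3.0,  1.0],
                   [ 1.5,  1.0, -1.0]])
FH = project_FH(FH_raw)
```

After `renormalize_VF`, all factor columns share the same L2 norm, so only their directions differ; after `project_FH`, each column is a non-negative weighting over factors that sums to 1.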
        if verbose == 1:
            print "VF: " + '%3.2e' % VF.euclid_norm() + ", DVF: " + '%3.2e' % (VFinc.euclid_norm()*(epsilonVFc/batch_size)) + ", FH: " + '%3.2e' % FH.euclid_norm() + ", DFH: " + '%3.2e' % (FHinc.euclid_norm()*(epsilonFHc/batch_size)) + ", bias_cov: " + '%3.2e' % bias_cov.euclid_norm() + ", Dbias_cov: " + '%3.2e' % (bias_covinc.euclid_norm()*(epsilonbc/batch_size)) + ", bias_vis: " + '%3.2e' % bias_vis.euclid_norm() + ", Dbias_vis: " + '%3.2e' % (bias_visinc.euclid_norm()*(epsilonbc/batch_size)) + ", wm: " + '%3.2e' % w_mean.euclid_norm() + ", Dwm: " + '%3.2e' % (w_meaninc.euclid_norm()*(epsilonw_meanc/batch_size)) + ", bm: " + '%3.2e' % bias_mean.euclid_norm() + ", Dbm: " + '%3.2e' % (bias_meaninc.euclid_norm()*(epsilonb_meanc/batch_size)) + ", step: " + '%3.2e' % hmc_step + ", rej: " + '%3.2e' % hmc_ave_rej
        sys.stdout.flush()
        # back-up every once in a while
        if np.mod(epoch,10) == 0:
            VF.copy_to_host()
            FH.copy_to_host()
            bias_cov.copy_to_host()
            w_mean.copy_to_host()
            bias_mean.copy_to_host()
            bias_vis.copy_to_host()
            savemat("ws_temp", {'VF':VF.numpy_array,'FH':FH.numpy_array,'bias_cov': bias_cov.numpy_array, 'bias_vis': bias_vis.numpy_array,'w_mean': w_mean.numpy_array, 'bias_mean': bias_mean.numpy_array, 'epoch':epoch})
    # final back-up
    VF.copy_to_host()
    FH.copy_to_host()
    bias_cov.copy_to_host()
    bias_vis.copy_to_host()
    w_mean.copy_to_host()
    bias_mean.copy_to_host()
    savemat("ws_fac" + str(num_fac) + "_cov" + str(num_hid_cov) + "_mean" + str(num_hid_mean), {'VF':VF.numpy_array,'FH':FH.numpy_array,'bias_cov': bias_cov.numpy_array, 'bias_vis': bias_vis.numpy_array, 'w_mean': w_mean.numpy_array, 'bias_mean': bias_mean.numpy_array, 'epoch':epoch})