annotate random_transformation.py @ 455:fb62f0e4bcfe

Reverted change ce6b4fd3ab29 (I do not believe anymore it was a typo)
author delallea@valhalla.apstat.com
date Thu, 02 Oct 2008 13:46:13 -0400
parents 18702ceb2096
children
rev   line source
356
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
1 """
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
2 New L{Op}s that aren't in core theano
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
3 """
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
4
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
5 from theano import sparse
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
6 from theano import tensor
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
7 from theano import scalar
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
8 from theano.gof import op
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
9
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
10 from theano.sparse import _is_dense, _is_sparse, _is_dense_result, _is_sparse_result
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
11
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
12 import scipy.sparse
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
13
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
14 import numpy
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
15
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
16 class RowRandomTransformation(op.Op):
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
17 """
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
18 Given C{x}, a (sparse) matrix with shape (exmpls, dimensions), we
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
19 multiply it by a deterministic random matrix of shape (dimensions,
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
20 length) to obtain random transformation output of shape (exmpls,
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
21 length).
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
22
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
23 Each element of the deterministic random matrix is selected uniformly
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
24 from [-1, +1).
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
25 @todo: Use another random distribution?
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
26
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
27 @note: This function should be written such that if length is
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
28 increased, we obtain the same results (except longer). Similarly,
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
29 the rows should be able to be permuted and get the same result in
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
30 the same fashion.
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
31
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
32 @todo: This may be slow?
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
33 @todo: Rewrite for dense matrices too?
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
34 @todo: Is there any way to verify the convention that each row is
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
35 an example? Should I rename the variables in the code to make the
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
36 semantics more explicit?
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
37 @todo: AUTOTEST: Autotest that dense and spare versions of this are identical.
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
38 @todo: Rename? Is Row the correct name? Maybe column-wise?
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
39
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
40 @type x: L{scipy.sparse.spmatrix}
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
41 @param x: Sparse matrix to be randomly transformed with shape (exmpls, dimensions)
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
42 @type length: int
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
43 @param length: The number of transformations of C{x} to be performed.
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
44 @param initial_seed: Initial seed for the RNG.
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
45 @rtype: L{numpy.ndarray}
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
46 @return: Array with C{length} random transformations, with shape (exmpls, length)
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
47 """
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
48
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
49 import random
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
50 """
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
51 RNG used for random transformations.
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
52 Does not share state with rest of program.
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
53 @todo: Make STATIC and private. Ask James or Olivier how to make this more Pythonic.
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
54 """
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
55 _trng = random.Random()
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
56
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
57 def __init__(self, x, length, initial_seed=0, **kwargs):
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
58 """
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
59 @todo: Which broadcastable values should I use?
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
60 """
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
61 assert 0 # Needs to be updated to Olivier's new Op creation approach
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
62 op.Op.__init__(self, **kwargs)
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
63 x = sparse.as_sparse(x)
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
64 self.initial_seed = initial_seed
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
65 self.length = length
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
66 self.inputs = [x]
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
67 self.outputs = [tensor.Tensor(x.dtype, broadcastable=[False, False])]
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
68 # self.outputs = [tensor.Tensor(x.dtype, broadcastable=[True, True])]
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
69
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
70 def _random_matrix_value(self, row, col, rows):
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
71 """
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
72 From a deterministic random matrix, find one element.
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
73 @param row: The row of the element to be read.
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
74 @param col: The column of the element to be read.
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
75 @param row: The number of rows in the matrix.
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
76 @type row: int
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
77 @type col: int
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
78 @type rows: int
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
79 @note: This function is designed such that if we extend
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
80 the number of columns in the random matrix, the values of
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
81 the earlier entries is unchanged.
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
82 @todo: Make this static
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
83 """
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
84 # Choose the random entry at (l, c)
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
85 rngidx = col * rows + row
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
86 # Set the random number state for this random entry
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
87 # Note: This may be slow
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
88 self._trng.seed(rngidx + self.initial_seed)
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
89
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
90 # Determine the value for this entry
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
91 val = self._trng.uniform(-1, +1)
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
92 # print "Exmpl #%d, dimension #%d => Random projection #%d has idx %d (+ seed %d) and value %f" % (r, c, j, rngidx, self.initial_seed, val)
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
93 return val
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
94
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
95 def impl(self, xorig):
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
96 assert _is_sparse(xorig)
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
97 assert len(xorig.shape) == 2
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
98 # Since conversions to and from the COO format are quite fast, you
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
99 # can use this approach to efficiently implement lots computations
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
100 # on sparse matrices.
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
101 x = xorig.tocoo()
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
102 (rows, cols) = x.shape
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
103 tot = rows * cols
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
104 out = numpy.zeros((rows, self.length))
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
105 # print "l = %d" % self.length
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
106 # print "x.getnnz() = %d" % x.getnnz()
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
107 all = zip(x.col, x.row, x.data)
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
108 all.sort() # TODO: Maybe this is very slow?
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
109 lastc = None
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
110 lastl = None
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
111 lastval = None
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
112 for l in range(self.length):
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
113 for (c, r, data) in all:
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
114 assert c < cols
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
115 assert r < rows
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
116 if not c == lastc or not l == lastl:
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
117 lastc = c
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
118 lastl = l
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
119 lastval = self._random_matrix_value(c, l, cols)
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
120 val = lastval
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
121 # val = self._random_matrix_value(c, l, cols)
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
122 # val = self._trng.uniform(-1, +1)
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
123 # val = 1.0
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
124 out[r][l] += val * data
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
125 return out
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
126 def __copy__(self):
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
127 return self.__class__(self.inputs[0], self.length, self.initial_seed)
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
128 def clone_with_new_inputs(self, *new_inputs):
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
129 return self.__class__(new_inputs[0], self.length, self.initial_seed)
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
130 def desc(self, *new_inputs):
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
131 return (self.__class__, self.length, self.initial_seed)
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
132 row_random_transformation = RowRandomTransformation()