annotate onehotop.py.scalar @ 469:4335309f4924

Split into preprocess for words and sequences
author Joseph Turian <turian@iro.umontreal.ca>
date Tue, 21 Oct 2008 16:32:06 -0400
parents 18702ceb2096
children
rev   line source
356
18702ceb2096 Added more functions
Joseph Turian <turian@iro.umontreal.ca>
parents:
diff changeset
"""
One hot Op
"""

#from theano import tensor
from theano.tensor import as_tensor, Tensor
#from theano import scalar
from theano.scalar import as_scalar
from theano.gof import op
from theano.gof.graph import Apply

import numpy

class OneHot(op.Op):
    """
    Construct a one-hot vector, x out of y.

    @todo: Document inputs and outputs
    @todo: Use 'bool' as output dtype? Or, at least 'int64' ? Not float64!
    @todo: Use 'bool' as output dtype, not 'int64' ?
    @todo: Allow this to operate on column vectors (Tensor)
    @todo: Describe better.
    @todo: What type is y?
    @todo: What about operating on L{Scalar}s?
    """

    def make_node(self, x, y):
        """
        @type x: Vector L{Tensor} of integers
        @param x: The entries of the one-hot vector to be one.
        @type y: Integer L{Scalar}
        @param y: The length (#columns) of the one-hot vectors.
        @return: A L{Tensor} of one-hot vectors

        @precondition: x < y for all entries of x
        @todo: Check that x and y are int types
        """
        #x = tensor.as_tensor(x)
        #y = scalar.as_scalar(y)
        x = as_tensor(x)
        y = as_scalar(y)
        #assert x.dtype[0:3] == "int"
        #assert y.dtype[0:3] == "int"
        inputs = [x, y]
        ##outputs = [tensor.Tensor("int64", broadcastable=[False, False])]
        #outputs = [tensor.Tensor("float64", broadcastable=[False, False])]
        #outputs = [Tensor("int64", broadcastable=[False, False])]
        outputs = [Tensor("float64", broadcastable=[False, False]).make_result()]
        node = Apply(op = self, inputs = inputs, outputs = outputs)
        return node

    def perform(self, node, (x, y), (out, )):
        assert x.dtype == "int64"
        assert type(y) == numpy.int64
        assert x.ndim == 1
        #out = numpy.zeros((x.shape[0], y), dtype="int64")
        out[0] = numpy.zeros((x.shape[0], y), dtype="float64")
        for c in range(x.shape[0]):
            assert x[c] < y
            out[0][c, x[c]] = 1

    def grad(self, (x, y), (out_gradient, )):
        return None, None
one_hot = OneHot()
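
The row-by-row loop in perform can also be written without Theano. The following is a minimal NumPy-only sketch of the same computation, not part of this changeset. The function name one_hot_rows is hypothetical, and rejecting negative indices is an added assumption (the listing above only asserts x[c] < y).

```python
import numpy as np


def one_hot_rows(x, y):
    """Return a (len(x), y) float64 matrix with a 1 in column x[i] of row i.

    Hypothetical helper: mirrors what OneHot.perform computes, using
    NumPy integer-array indexing instead of an explicit Python loop.
    """
    x = np.asarray(x, dtype=np.int64)
    if x.ndim != 1:
        raise ValueError("x must be a 1-D vector of indices")
    # Assumption beyond the original: negative indices are rejected too,
    # since NumPy would otherwise wrap them around silently.
    if x.size and (x.min() < 0 or x.max() >= y):
        raise ValueError("every entry of x must satisfy 0 <= x < y")
    out = np.zeros((x.shape[0], y), dtype=np.float64)
    # Pair each row index i with its column index x[i] and set that cell.
    out[np.arange(x.shape[0]), x] = 1.0
    return out


if __name__ == "__main__":
    print(one_hot_rows([2, 0, 1], 4))
```

As in the Op, the output here is float64 with one row per entry of x and y columns. Switching to a boolean or integer dtype, as the @todo notes suggest, would only change the dtype argument of numpy.zeros.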