annotate transformations/filetensor.py @ 41:fdb0e0870fb4

Beaucoup de modifications à pipeline.py pour généraliser et un début de visualisation, et créé un wrapper (run_pipeline.py) pour appeler avec GIMP. - Modifications à pipeline.py - Wrappé la boucle du pipeline dans une classe - Isolé le problème de itérer sur les batches et les complexités dans des itérateurs - Permet d'avoir des ordres compliqués de batch (plusieurs sources), de complexités - Maintenant regenerate_parameters() est appelé pour chaque image. - Command line arguments avec getopt(). On pourra rajouter des options ainsi. - run_pipeline.py - Le but est de permettre de passer des arguments. Pas facile (pas trouvé comment de façon simple) avec la command line pour appeler GIMP en mode batch. C'est un hack ici. - Le but ultime est de permettre de lancer les jobs sur les clusters avec dbidispatch en précisant les options (diff. pour chaque job) sur la ligne de commande.
author fsavard
date Wed, 03 Feb 2010 17:08:27 -0500
parents faacc76d21c2
children
rev   line source
10
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
1 """
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
2 Read and write the matrix file format described at
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
3 U{http://www.cs.nyu.edu/~ylclab/data/norb-v1.0/index.html}
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
4
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
5 The format is for dense tensors:
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
6
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
7 - magic number indicating type and endianness - 4bytes
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
8 - rank of tensor - int32
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
9 - dimensions - int32, int32, int32, ...
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
10 - <data>
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
11
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
12 The number of dimensions and rank is slightly tricky:
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
13 - for scalar: rank=0, dimensions = [1, 1, 1]
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
14 - for vector: rank=1, dimensions = [?, 1, 1]
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
15 - for matrix: rank=2, dimensions = [?, ?, 1]
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
16
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
17 For rank >= 3, the number of dimensions matches the rank exactly.
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
18
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
19
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
20 @todo: add complex type support
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
21
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
22 """
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
23 import sys
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
24 import numpy
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
25
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
26 def _prod(lst):
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
27 p = 1
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
28 for l in lst:
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
29 p *= l
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
30 return p
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
31
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
32 _magic_dtype = {
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
33 0x1E3D4C51 : ('float32', 4),
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
34 #0x1E3D4C52 : ('packed matrix', 0), #what is a packed matrix?
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
35 0x1E3D4C53 : ('float64', 8),
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
36 0x1E3D4C54 : ('int32', 4),
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
37 0x1E3D4C55 : ('uint8', 1),
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
38 0x1E3D4C56 : ('int16', 2),
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
39 }
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
40 _dtype_magic = {
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
41 'float32': 0x1E3D4C51,
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
42 #'packed matrix': 0x1E3D4C52,
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
43 'float64': 0x1E3D4C53,
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
44 'int32': 0x1E3D4C54,
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
45 'uint8': 0x1E3D4C55,
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
46 'int16': 0x1E3D4C56
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
47 }
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
48
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
49 def _read_int32(f):
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
50 """unpack a 4-byte integer from the current position in file f"""
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
51 s = f.read(4)
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
52 s_array = numpy.fromstring(s, dtype='int32')
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
53 return s_array.item()
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
54
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
55 def _read_header(f, debug=False):
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
56 """
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
57 :returns: data type, element size, rank, shape, size
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
58 """
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
59 #what is the data type of this matrix?
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
60 #magic_s = f.read(4)
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
61 #magic = numpy.fromstring(magic_s, dtype='int32')
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
62 magic = _read_int32(f)
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
63 magic_t, elsize = _magic_dtype[magic]
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
64 if debug:
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
65 print 'header magic', magic, magic_t, elsize
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
66 if magic_t == 'packed matrix':
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
67 raise NotImplementedError('packed matrix not supported')
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
68
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
69 #what is the rank of the tensor?
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
70 ndim = _read_int32(f)
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
71 if debug: print 'header ndim', ndim
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
72
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
73 #what are the dimensions of the tensor?
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
74 dim = numpy.fromfile(f, dtype='int32', count=max(ndim,3))[:ndim]
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
75 dim_size = _prod(dim)
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
76 if debug: print 'header dim', dim, dim_size
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
77
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
78 return magic_t, elsize, ndim, dim, dim_size
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
79
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
80 class arraylike(object):
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
81 """Provide an array-like interface to the filetensor in f.
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
82
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
83 The rank parameter to __init__ controls how this object interprets the underlying tensor.
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
84 Its behaviour should be clear from the following example.
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
85 Suppose the underlying tensor is MxNxK.
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
86
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
87 - If rank is 0, self[i] will be a scalar and len(self) == M*N*K.
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
88
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
89 - If rank is 1, self[i] is a vector of length K, and len(self) == M*N.
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
90
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
91 - If rank is 3, self[i] is a 3D tensor of size MxNxK, and len(self)==1.
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
92
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
93 - If rank is 5, self[i] is a 5D tensor of size 1x1xMxNxK, and len(self) == 1.
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
94
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
95
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
96 :note: Objects of this class generally require exclusive use of the underlying file handle, because
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
97 they call seek() every time you access an element.
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
98 """
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
99
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
100 f = None
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
101 """File-like object"""
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
102
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
103 magic_t = None
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
104 """numpy data type of array"""
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
105
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
106 elsize = None
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
107 """number of bytes per scalar element"""
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
108
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
109 ndim = None
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
110 """Rank of underlying tensor"""
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
111
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
112 dim = None
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
113 """tuple of array dimensions (aka shape)"""
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
114
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
115 dim_size = None
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
116 """number of scalars in the tensor (prod of dim)"""
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
117
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
118 f_start = None
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
119 """The file position of the first element of the tensor"""
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
120
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
121 readshape = None
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
122 """tuple of array dimensions of the block that we read"""
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
123
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
124 readsize = None
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
125 """number of elements we must read for each block"""
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
126
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
127 def __init__(self, f, rank=0, debug=False):
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
128 self.f = f
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
129 self.magic_t, self.elsize, self.ndim, self.dim, self.dim_size = _read_header(f,debug)
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
130 self.f_start = f.tell()
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
131
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
132 if rank <= self.ndim:
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
133 self.readshape = tuple(self.dim[self.ndim-rank:])
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
134 else:
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
135 self.readshape = tuple(self.dim)
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
136
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
137 #self.readshape = tuple(self.dim[self.ndim-rank:]) if rank <= self.ndim else tuple(self.dim)
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
138
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
139 if rank <= self.ndim:
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
140 padding = tuple()
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
141 else:
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
142 padding = (1,) * (rank - self.ndim)
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
143
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
144 #padding = tuple() if rank <= self.ndim else (1,) * (rank - self.ndim)
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
145 self.returnshape = padding + self.readshape
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
146 self.readsize = _prod(self.readshape)
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
147 if debug: print 'READ PARAM', self.readshape, self.returnshape, self.readsize
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
148
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
149 def __len__(self):
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
150 return _prod(self.dim[:self.ndim-len(self.readshape)])
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
151
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
152 def __getitem__(self, idx):
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
153 if idx >= len(self):
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
154 raise IndexError(idx)
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
155 self.f.seek(self.f_start + idx * self.elsize * self.readsize)
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
156 return numpy.fromfile(self.f,
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
157 dtype=self.magic_t,
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
158 count=self.readsize).reshape(self.returnshape)
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
159
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
160
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
161 #
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
162 # TODO: implement item selection:
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
163 # e.g. load('some mat', subtensor=(:6, 2:5))
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
164 #
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
165 # This function should be memory efficient by:
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
166 # - allocating an output matrix at the beginning
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
167 # - seeking through the file, reading subtensors from multiple places
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
168 def read(f, subtensor=None, debug=False):
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
169 """Load all or part of file 'f' into a numpy ndarray
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
170
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
171 @param f: file from which to read
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
172 @type f: file-like object
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
173
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
174 If subtensor is not None, it should be like the argument to
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
175 numpy.ndarray.__getitem__. The following two expressions should return
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
176 equivalent ndarray objects, but the one on the left may be faster and more
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
177 memory efficient if the underlying file f is big.
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
178
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
179 read(f, subtensor) <===> read(f)[*subtensor]
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
180
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
181 Support for subtensors is currently spotty, so check the code to see if your
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
182 particular type of subtensor is supported.
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
183
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
184 """
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
185 magic_t, elsize, ndim, dim, dim_size = _read_header(f,debug)
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
186 f_start = f.tell()
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
187
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
188 rval = None
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
189 if subtensor is None:
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
190 rval = numpy.fromfile(f, dtype=magic_t, count=_prod(dim)).reshape(dim)
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
191 elif isinstance(subtensor, slice):
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
192 if subtensor.step not in (None, 1):
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
193 raise NotImplementedError('slice with step', subtensor.step)
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
194 if subtensor.start not in (None, 0):
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
195 bytes_per_row = _prod(dim[1:]) * elsize
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
196 f.seek(f_start + subtensor.start * bytes_per_row)
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
197 dim[0] = min(dim[0], subtensor.stop) - subtensor.start
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
198 rval = numpy.fromfile(f, dtype=magic_t, count=_prod(dim)).reshape(dim)
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
199 else:
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
200 raise NotImplementedError('subtensor access not written yet:', subtensor)
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
201
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
202 return rval
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
203
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
204 def write(f, mat):
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
205 """Write a numpy.ndarray to file.
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
206
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
207 @param f: file into which to write
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
208 @type f: file-like object
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
209
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
210 @param mat: array to write to file
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
211 @type mat: numpy ndarray or compatible
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
212
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
213 """
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
214 def _write_int32(f, i):
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
215 i_array = numpy.asarray(i, dtype='int32')
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
216 if 0: print 'writing int32', i, i_array
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
217 i_array.tofile(f)
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
218
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
219 try:
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
220 _write_int32(f, _dtype_magic[str(mat.dtype)])
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
221 except KeyError:
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
222 raise TypeError('Invalid ndarray dtype for filetensor format', mat.dtype)
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
223
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
224 _write_int32(f, len(mat.shape))
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
225 shape = mat.shape
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
226 if len(shape) < 3:
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
227 shape = list(shape) + [1] * (3 - len(shape))
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
228 if 0: print 'writing shape =', shape
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
229 for sh in shape:
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
230 _write_int32(f, sh)
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
231 mat.tofile(f)
faacc76d21c2 Basic new pipeline script for the images tranforms
Arnaud Bergeron <abergeron@gmail.com>
parents:
diff changeset
232