Mercurial > ift6266
comparison writeup/techreport.tex @ 416:5f9d04dda707
Correction d'une erreur pour pinch et ajout d'une ref bibliographique
author | fsavard |
---|---|
date | Thu, 29 Apr 2010 18:26:30 -0400 |
parents | 1e9788ce1680 |
children | 0282882aa91f |
comparison
equal
deleted
inserted
replaced
415:1e9788ce1680 | 416:5f9d04dda707 |
---|---|
91 This allows to produce scaling, translation, rotation and shearing variances. We took care that the maximum rotation applied | 91 This allows to produce scaling, translation, rotation and shearing variances. We took care that the maximum rotation applied |
92 to the image is low enough not to confuse classes. | 92 to the image is low enough not to confuse classes. |
93 | 93 |
94 \subsection{Local Elastic Deformations} | 94 \subsection{Local Elastic Deformations} |
95 | 95 |
96 This filter induces a "wiggly" effect in the image. The description here will be brief, as the algorithm follows precisely what is described in . | 96 This filter induces a "wiggly" effect in the image. The description here will be brief, as the algorithm follows precisely what is described in \cite{SimardSP03}. |
97 | 97 |
98 The general idea is to generate two "displacements" fields, for horizontal and vertical displacements of pixels. Each of these fields has the same size as the original image. | 98 The general idea is to generate two "displacements" fields, for horizontal and vertical displacements of pixels. Each of these fields has the same size as the original image. |
99 | 99 |
100 When generating the transformed image, we'll loop over the x and y positions in the fields and select, as a value, the value of the pixel in the original image at the (relative) position given by the displacement fields for this x and y. If the position we'd retrieve is outside the borders of the image, we use a 0 value instead. | 100 When generating the transformed image, we'll loop over the x and y positions in the fields and select, as a value, the value of the pixel in the original image at the (relative) position given by the displacement fields for this x and y. If the position we'd retrieve is outside the borders of the image, we use a 0 value instead. |
101 | 101 |
117 | 117 |
118 \subsection{Pinch} | 118 \subsection{Pinch} |
119 | 119 |
120 This is another GIMP filter we used. The filter is in fact named "Whirl and pinch", but we don't use the "whirl" part (whirl is set to 0). As described in GIMP, a pinch is "similar to projecting the image onto an elastic surface and pressing or pulling on the center of the surface". | 120 This is another GIMP filter we used. The filter is in fact named "Whirl and pinch", but we don't use the "whirl" part (whirl is set to 0). As described in GIMP, a pinch is "similar to projecting the image onto an elastic surface and pressing or pulling on the center of the surface". |
121 | 121 |
122 Mathematically, think of drawing a circle of radius $r$ around a center point $C$. Any point (pixel) $P$ belonging to that disk (region inside circle) will have its value recalculated by taking the value of another "source" pixel in the original image. The position of that source pixel is found on the line thats goes through $C$ and $P$, but at some other distance $d_2$. Define $d_1$ to be the distance between $P$ and $C$. $d_2$ is given by $d_2 = sin(\frac{\pi{}d_1}{2r})^{-pinch}$, where $pinch$ is a parameter to the filter. | 122 Mathematically, for a square input image, think of drawing a circle of radius $r$ around a center point $C$. Any point (pixel) $P$ belonging to that disk (region inside circle) will have its value recalculated by taking the value of another "source" pixel in the original image. The position of that source pixel is found on the line thats goes through $C$ and $P$, but at some other distance $d_2$. Define $d_1$ to be the distance between $P$ and $C$. $d_2$ is given by $d_2 = sin(\frac{\pi{}d_1}{2r})^{-pinch} \times d_1$, where $pinch$ is a parameter to the filter. |
123 | |
124 If the image is not square | |
123 | 125 |
124 The actual value is given by bilinear interpolation considering the pixels around the (non-integer) source position. | 126 The actual value is given by bilinear interpolation considering the pixels around the (non-integer) source position. |
125 | 127 |
126 The value for $pinch$ in our case was given by sampling from an uniform distribution over the range $[-complexity, 0.7 \times complexity]$. | 128 The value for $pinch$ in our case was given by sampling from an uniform distribution over the range $[-complexity, 0.7 \times complexity]$. |
127 | 129 |
134 | 136 |
135 These four sizes collectively define a window centered on the middle pixel of the occlusive image. This is the part that will be extracted as the occlusion. | 137 These four sizes collectively define a window centered on the middle pixel of the occlusive image. This is the part that will be extracted as the occlusion. |
136 | 138 |
137 The next step is to select a destination position in the occluded image. Vertical and horizontal displacements $y\_arrivee$ and $x\_arrivee$ are selected according to Gaussian distributions of mean 0 and of standard deviations of, respectively, 3 and 2. Then an horizontal placement mode, $endroit$ (meaning location), is selected to be of three values meaning left, middle or right. | 139 The next step is to select a destination position in the occluded image. Vertical and horizontal displacements $y\_arrivee$ and $x\_arrivee$ are selected according to Gaussian distributions of mean 0 and of standard deviations of, respectively, 3 and 2. Then an horizontal placement mode, $endroit$ (meaning location), is selected to be of three values meaning left, middle or right. |
138 | 140 |
139 If $endroit$ is "middle", the occlusion will be horizontally centered around the horizontal middle of the occluded image, then shifted according to $x_\arrivee$. If $endroit$ is "left", it will be placed on the left of the occluded image, then displaced right according to $x_\arrivee$. The contrary happens if $endroit$ is $right$. | 141 If $endroit$ is "middle", the occlusion will be horizontally centered around the horizontal middle of the occluded image, then shifted according to $x\_arrivee$. If $endroit$ is "left", it will be placed on the left of the occluded image, then displaced right according to $x\_arrivee$. The contrary happens if $endroit$ is $right$. |
140 | 142 |
141 In both the horizontal and vertical positionning, the maximum position in either direction is such that the selected occlusion won't go beyond the borders of the occluded image. | 143 In both the horizontal and vertical positionning, the maximum position in either direction is such that the selected occlusion won't go beyond the borders of the occluded image. |
142 | 144 |
143 This filter has a probability of not being applied, at all, of 60%. | 145 This filter has a probability of not being applied, at all, of 60\%. |
144 | 146 |
145 \subsection{Background Images} | 147 \subsection{Background Images} |
146 | 148 |
147 This transformation adds a random background behind the letter. The background is chosen by first selecting, at random, an image from a set of images. Then we choose a 32x32 subregion of that image as the background image (by sampling x and y positions uniformly while making sure not to cross image borders). | 149 This transformation adds a random background behind the letter. The background is chosen by first selecting, at random, an image from a set of images. Then we choose a 32x32 subregion of that image as the background image (by sampling x and y positions uniformly while making sure not to cross image borders). |
148 | 150 |