--- a/writeup/nips2010_submission.tex	Thu Jun 03 09:18:02 2010 -0400
+++ b/writeup/nips2010_submission.tex	Thu Jun 03 11:02:39 2010 -0400
@@ -68,7 +68,7 @@

 Self-taught learning~\citep{RainaR2007} is a paradigm that combines principles
 of semi-supervised and multi-task learning: the learner can exploit examples
-that are unlabeled and/or come from a distribution different from the target
+that are unlabeled and possibly come from a distribution different from the target
 distribution, e.g., from other classes than those of interest.
 It has already been shown that deep learners can clearly take advantage of
 unsupervised learning and unlabeled examples~\citep{Bengio-2009,WestonJ2008-small},
@@ -129,7 +129,7 @@
 %\end{minipage}%
 %\hspace{0.3cm}\begin{minipage}[b]{0.86\linewidth}
 This section describes the different transformations we used to stochastically
-transform source images such as the one on the left
+transform $32 \times 32$ source images (such as the one on the left)
 in order to obtain data from a larger distribution which
 covers a domain substantially larger than the clean characters distribution from
 which we start.
@@ -176,7 +176,7 @@
 element from a subset of the $n=round(m \times complexity)$ smallest structuring elements
 where $m=10$ for dilation and $m=6$ for erosion (to avoid completely erasing thin characters).
 A neutral element (no transformation)
-is always present in the set. is applied.
+is always present in the set.
 %\vspace{.4cm}
 %\end{minipage}
 %\vspace{-.7cm}
@@ -186,13 +186,14 @@
 \includegraphics[scale=.4]{images/Slant_only.png}\\
 {\bf Slant}
 \end{minipage}%
-\hspace{0.3cm}\begin{minipage}[b]{0.83\linewidth}
+\hspace{0.3cm}
+\begin{minipage}[b]{0.83\linewidth}
 %\centering
-%\vspace*{-15mm}
 To produce {\bf slant}, each row of the image is shifted
 proportionally to its height: $shift = round(slant \times height)$.
 $slant \sim U[-complexity,complexity]$.
-\vspace{1.5cm}
+The shift is randomly chosen to be either to the left or to the right.
+\vspace{1.1cm}
 \end{minipage}
 %\vspace*{-4mm}

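As a rough illustration of the slant rule in the hunk above, the sketch below shifts each row by $round(slant \times height)$. It is not the authors' implementation: the names `slant_rows` and `random_slant` are hypothetical, `height` is assumed to be the row index counted from the top, and pixels shifted outside the image are assumed to be dropped and replaced by a fill value.

```python
import random


def slant_rows(image, slant, fill=0.0):
    """Shift every row of `image` (a list of equal-length rows) horizontally.

    Row number `height` (assumed to be counted from the top) moves by
    round(slant * height) pixels; a negative slant shifts to the left.
    Note that Python's round() rounds exact halves to the nearest even integer.
    """
    width = len(image[0])
    out = []
    for height, row in enumerate(image):
        shift = round(slant * height)
        new_row = [fill] * width
        for x, value in enumerate(row):
            target = x + shift
            if 0 <= target < width:  # assumption: pixels leaving the image are lost
                new_row[target] = value
        out.append(new_row)
    return out


def random_slant(image, complexity, rng=random):
    """Draw slant ~ U[-complexity, complexity] and apply it."""
    slant = rng.uniform(-complexity, complexity)
    return slant_rows(image, slant)
```

Because the sign of `slant` already decides the direction, a single uniform draw covers both left and right shifts in this sketch.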
@@ -213,10 +214,10 @@
 nearest to $(ax+by+c,dx+ey+f)$,
 producing scaling, translation, rotation and shearing.
 Marginal distributions of $(a,b,c,d,e,f)$ have been tuned to
-forbid large rotations (not to confuse classes) but to give good
+forbid large rotations (to avoid confusing classes) but to give good
 variability of the transformation: $a$ and $d$ $\sim U[1-3
-complexity,1+3\,complexity]$, $b$ and $e$ $\sim[-3 \,complexity,3\,
-complexity]$ and $c$ and $f$ $\sim U[-4 \,complexity, 4 \,
+complexity,1+3\,complexity]$, $b$ and $e$ $\sim U[-3 \,complexity,3\,
+complexity]$, and $c$ and $f \sim U[-4 \,complexity, 4 \,
 complexity]$.\\
 %\end{minipage}

@@ -259,15 +260,16 @@
 %\vspace{.6cm}
 %\end{minipage}%
 %\hspace{0.3cm}\begin{minipage}[b]{0.86\linewidth}
-The {\bf pinch} module applies the ``Whirl and pinch'' GIMP filter with whirl was set to 0.
+The {\bf pinch} module applies the ``Whirl and pinch'' GIMP filter with whirl set to 0.
 A pinch is ``similar to projecting the image onto an elastic
 surface and pressing or pulling on the center of the surface'' (GIMP documentation manual).
 For a square input image, draw a radius-$r$ disk
-around $C$. Any pixel $P$ belonging to
+around its center $C$. Any pixel $P$ belonging to
 that disk has its value replaced by
 the value of a ``source'' pixel in the original image,
 on the line that goes through $C$ and $P$, but
-at some other distance $d_2$. Define $d_1=distance(P,C) = sin(\frac{\pi{}d_1}{2r})^{-pinch} \times
+at some other distance $d_2$. Define $d_1=distance(P,C)$
+and $d_2 = sin(\frac{\pi{}d_1}{2r})^{-pinch} \times
 d_1$, where $pinch$ is a parameter of the filter.
 The actual value is given by bilinear interpolation considering the pixels
 around the (non-integer) source position thus found.
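
The corrected pinch formula in the hunk above, $d_2 = sin(\frac{\pi{}d_1}{2r})^{-pinch} \times d_1$, can be sketched as follows. This is a minimal illustration under stated assumptions, not the GIMP code: the function names are hypothetical, the center pixel is assumed to map to itself (the formula is undefined at $d_1 = 0$ for positive $pinch$), and pixels outside the disk are assumed to be left unchanged. Bilinear interpolation at the returned position is omitted.

```python
import math


def pinch_source_distance(d1, r, pinch):
    """Return d2 = sin(pi * d1 / (2 * r)) ** (-pinch) * d1."""
    if d1 == 0:
        return 0.0  # assumption: the center maps to itself
    return math.sin(math.pi * d1 / (2 * r)) ** (-pinch) * d1


def pinch_source_position(p, c, r, pinch):
    """Return the (non-integer) source position for destination pixel `p`.

    The source lies on the line through the center `c` and `p`,
    at distance d2 from `c` instead of d1.
    """
    dx, dy = p[0] - c[0], p[1] - c[1]
    d1 = math.hypot(dx, dy)
    if d1 == 0 or d1 > r:
        # assumption: pixels outside the radius-r disk are not moved
        return (float(p[0]), float(p[1]))
    scale = pinch_source_distance(d1, r, pinch) / d1
    return (c[0] + dx * scale, c[1] + dy * scale)
```

Two sanity checks follow directly from the formula: with $pinch = 0$ every pixel is its own source, and on the rim of the disk ($d_1 = r$) the sine equals 1, so the transformation is continuous with the untouched area outside.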
@@ -310,8 +312,9 @@
 \vspace*{-18mm}
 The {\bf occlusion} module selects a random rectangle from an {\em occluder} character
 image and places it over the original {\em occluded}
-image. Pixels are combined by taking the max(occluder,occluded),
-closer to black. The rectangle corners
+image. Pixels are combined by taking the max(occluder, occluded),
+i.e. keeping the lighter ones.
+The rectangle corners
 are sampled so that larger complexity gives larger rectangles.
 The destination position in the occluded image are also sampled
 according to a normal distribution (more details in~\citet{ift6266-tr-anonymous}).
@@ -334,18 +337,19 @@
 %\end{minipage}%
 %\hspace{0.3cm}\begin{minipage}[t]{0.86\linewidth}
 With the {\bf Gaussian smoothing} module,
-different regions of the image are spatially smoothed by convolving
-the image with a symmetric Gaussian kernel of
+different regions of the image are spatially smoothed.
+This is achieved  by first convolving
+the image with an isotropic Gaussian kernel of
 size and variance chosen uniformly in the ranges $[12,12 + 20 \times
-complexity]$ and $[2,2 + 6 \times complexity]$. The result is normalized
-between $0$ and $1$.  We also create a symmetric weighted averaging window, of the
+complexity]$ and $[2,2 + 6 \times complexity]$. This filtered image is normalized
+between $0$ and $1$.  We also create an isotropic weighted averaging window, of the
 kernel size, with maximum value at the center.  For each image we sample
 uniformly from $3$ to $3 + 10 \times complexity$ pixels that will be
 averaging centers between the original image and the filtered one.  We
 initialize to zero a mask matrix of the image size. For each selected pixel
-we add to the mask the averaging window centered to it.  The final image is
-computed from the following element-wise operation: $\frac{image + filtered
-  image \times mask}{mask+1}$.
+we add to the mask the averaging window centered on it.  The final image is
+computed from the following element-wise operation: $\frac{image + filtered\_image
+\times mask}{mask+1}$.
 This module is skipped with probability 75\%.
 %\end{minipage}

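The mask accumulation and the final element-wise operation $\frac{image + filtered\_image \times mask}{mask+1}$ from the hunk above can be sketched as below. This is an illustrative sketch only: `add_window` and `blend` are hypothetical names, the averaging window is assumed to be square with odd side length, and the part of a window that falls outside the image is assumed to be clipped. The Gaussian filtering itself is not shown.

```python
def add_window(mask, window, center):
    """Add `window` to `mask` in place, centered on pixel `center` = (row, col).

    Assumptions: `window` is square with odd side length, and any part of it
    that falls outside the mask is ignored.
    """
    half = len(window) // 2
    cy, cx = center
    for wy, window_row in enumerate(window):
        for wx, weight in enumerate(window_row):
            y, x = cy + wy - half, cx + wx - half
            if 0 <= y < len(mask) and 0 <= x < len(mask[0]):
                mask[y][x] += weight


def blend(image, filtered, mask):
    """Element-wise (image + filtered * mask) / (mask + 1)."""
    return [
        [(i + f * m) / (m + 1) for i, f, m in zip(img_row, flt_row, mask_row)]
        for img_row, flt_row, mask_row in zip(image, filtered, mask)
    ]
```

Where the mask is 0 the original pixel is returned unchanged, and as the mask grows the result moves toward the filtered image, so smoothing is strongest near the sampled averaging centers.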
@@ -366,9 +370,10 @@
 %\end{minipage}%
 %\hspace{-0cm}\begin{minipage}[t]{0.86\linewidth}
 %\vspace*{-20mm}
-This module {\bf permutes neighbouring pixels}. It first selects
-fraction $\frac{complexity}{3}$ of pixels randomly in the image. Each of them are then
-sequentially exchanged with one other in as $V4$ neighbourhood.
+This module {\bf permutes neighbouring pixels}. It first selects a
+fraction $\frac{complexity}{3}$ of pixels randomly in the image. Each
+of these pixels is then sequentially exchanged with a random pixel
+among its four nearest neighbors (on its left, right, top or bottom).
 This module is skipped with probability 80\%.\\
 \vspace*{1mm}
 \end{minipage}
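
The reworded description of the pixel permutation module in the hunk above can be sketched as follows. This is a sketch under assumptions rather than the authors' code: `permute_neighbours` is a hypothetical name, the number of selected pixels is assumed to be the rounded fraction of the image size, and border pixels are assumed to swap only with neighbours that lie inside the image. The 80\% skip probability is left to the caller.

```python
import random


def permute_neighbours(image, complexity, rng=random):
    """Swap a fraction complexity/3 of the pixels with a random 4-neighbour.

    Returns a new image; `image` (a list of equal-length rows) is not modified.
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    coords = [(y, x) for y in range(height) for x in range(width)]
    n_selected = round(height * width * complexity / 3)
    # Swaps are applied sequentially, so a later swap sees earlier ones.
    for y, x in rng.sample(coords, n_selected):
        neighbours = [
            (y + dy, x + dx)
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1))
            if 0 <= y + dy < height and 0 <= x + dx < width
        ]
        if not neighbours:  # only possible for a 1x1 image
            continue
        ny, nx = rng.choice(neighbours)
        out[y][x], out[ny][nx] = out[ny][nx], out[y][x]
    return out
```

Since the module only exchanges pixel values, the grey-level histogram of the image is preserved; only the local spatial arrangement changes.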
@@ -455,7 +460,7 @@
 of applying 1, 2, or 3 patches are (50\%,30\%,20\%).
 \end{minipage}

-\vspace*{2mm}
+\vspace*{1mm}

 \begin{minipage}[t]{0.25\linewidth}
 \centering
@@ -463,7 +468,7 @@
 {\bf Grey Level \& Contrast}
 \end{minipage}%
 \hspace{-12mm}\begin{minipage}[t]{0.82\linewidth}
-t -m "\vspace*{-18mm}
+\vspace*{-18mm}
 The {\bf grey level and contrast} module changes the contrast by changing grey levels, and may invert the image polarity (white
 to black and black to white). The contrast is $C \sim U[1-0.85 \times complexity,1]$
 so the image is normalized into $[\frac{1-C}{2},1-\frac{1-C}{2}]$. The
@@ -486,8 +491,7 @@
 \end{figure}
 \fi

-
-\vspace*{-2mm}
+\vspace*{-3mm}
 \section{Experimental Setup}
 \vspace*{-1mm}