ift6266: writeup/techreport.tex comparison

comparison writeup/techreport.tex @ 477:534d4ecf1bd1

small description of the font added

author	Xavier Glorot <glorotxa@iro.umontreal.ca>
date	Sun, 30 May 2010 17:24:26 -0400
parents	5fa1c653620c
children	6593e67381a3


-:db28764b8252
+:534d4ecf1bd1
 Previous work on that dataset mostly reports performance on the digits only.
 Here we use all the classes in both the training and testing phases.
 \item {\bf Fonts}
+In order to have a good variety of sources, we downloaded a large number of free fonts from {\tt http://anonymous.url.net}.
+%real address {\tt http://cg.scs.carleton.ca/~luc/freefonts.html}
+Together with the Windows 7 fonts, this adds up to a total of $9817$ different fonts, from which we choose uniformly at random.
+The ttf file is either used as input to the Captcha generator (see next item) or rendered into a corresponding image
+that is used directly as input to our models.
+%Guillaume are there other details I forgot on the font selection?
 \item {\bf Captchas}
 The Captcha data source is an adaptation of the \emph{pycaptcha} library (a Python-based captcha generator library) for
 generating characters of the same format as the NIST dataset. The core of this data source is composed of a random character
 generator and various kinds of transformations similar to those described in the previous sections.
 In order to increase the variability of the generated data, different fonts are used for generating the characters.
 \item {\bf OCR data}
 \end{itemize}
 \subsubsection{Data Sets}
 \begin{itemize}
-\item {\bf NIST}
 \item {\bf P07}
 The dataset P07 is sampled with our transformation pipeline with a complexity parameter of $0.7$.
 For each new example to generate, we choose one source with the following probabilities: $0.1$ for the fonts,
 $0.25$ for the captchas, $0.25$ for OCR data and $0.4$ for NIST. We apply all the transformations in order,
 and for each of them we sample a complexity uniformly in the range $[0,0.7]$.
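
The P07 sampling rule described in the hunk above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual pipeline code: the helper name `sample_p07_plan`, the dictionary keys, and the transformation names in the usage line are hypothetical. Only the source probabilities and the $[0, 0.7]$ complexity range come from the text.

```python
import random

# Source mixture and maximum complexity as stated in the P07 description.
SOURCE_PROBS = {"fonts": 0.1, "captchas": 0.25, "ocr": 0.25, "nist": 0.4}
MAX_COMPLEXITY = 0.7


def sample_p07_plan(transformations, rng=random):
    """Hypothetical helper: choose a data source for one example and draw
    an independent complexity for each transformation, in pipeline order."""
    names = list(SOURCE_PROBS)
    weights = [SOURCE_PROBS[name] for name in names]
    # Pick exactly one source according to the stated probabilities.
    source = rng.choices(names, weights=weights, k=1)[0]
    # One uniform draw in [0, MAX_COMPLEXITY] per transformation.
    plan = [(t, rng.uniform(0.0, MAX_COMPLEXITY)) for t in transformations]
    return source, plan


# Usage with placeholder transformation names (assumed, not from the report).
source, plan = sample_p07_plan(["slant", "thickness"], random.Random(0))
```

Passing a seeded `random.Random` instance keeps the draw reproducible, which is convenient when regenerating a fixed dataset.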
