Mercurial > ift6266
diff writeup/techreport.tex @ 477:534d4ecf1bd1
small description of the font added
author | Xavier Glorot <glorotxa@iro.umontreal.ca> |
---|---|
date | Sun, 30 May 2010 17:24:26 -0400 |
parents | 5fa1c653620c |
children | 6593e67381a3 |
--- a/writeup/techreport.tex Sun May 30 12:06:45 2010 -0400
+++ b/writeup/techreport.tex Sun May 30 17:24:26 2010 -0400
@@ -429,6 +429,13 @@
 \item {\bf Fonts}
+In order to have a good variety of sources we downloaded an important number of free fonts from: {\tt http://anonymous.url.net}
+%real adress {\tt http://cg.scs.carleton.ca/~luc/freefonts.html}
+in addition to Windows 7's, this adds up to a total of $9817$ different fonts that we can choose uniformly.
+The ttf file is either used as input of the Captcha generator (see next item) or, by producing a corresponding image,
+directly as input to our models.
+%Guillaume are there other details I forgot on the font selection?
+
 \item {\bf Captchas}
 The Captcha data source is an adaptation of the \emph{pycaptcha} library (a python based captcha generator library) for generating characters of the same format as the NIST dataset. The core of this data source is composed with a random character
@@ -442,7 +449,6 @@
 \subsubsection{Data Sets}
 \begin{itemize}
-\item {\bf NIST}
 \item {\bf P07}
 The dataset P07 is sampled with our transformation pipeline with a complexity parameter of $0.7$. For each new exemple to generate, we choose one source with the following probability: $0.1$ for the fonts,
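
The added Fonts paragraph says that one of the $9817$ fonts is chosen uniformly. A minimal Python sketch of that selection step might look like the following; it is not taken from the project's code, and the names `list_fonts` and `choose_font`, as well as the directory layout of `.ttf` files, are assumptions made for illustration only.

```python
import random
from pathlib import Path


def list_fonts(font_dir):
    """Return the sorted .ttf paths found under font_dir.

    The directory layout is hypothetical; the report does not say
    how the downloaded fonts are stored.
    """
    return sorted(Path(font_dir).rglob("*.ttf"))


def choose_font(font_paths, rng=random):
    """Pick one font uniformly at random.

    Each entry has probability 1 / len(font_paths). `rng` can be the
    `random` module or a seeded `random.Random` instance.
    """
    if not font_paths:
        raise ValueError("no fonts available")
    return rng.choice(font_paths)
```

Passing a seeded `random.Random` as `rng` makes the draw reproducible, which can help when regenerating a dataset. The chosen path would then go either to the Captcha generator or to an image renderer, as the paragraph describes.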