# HG changeset patch
# User goldfinger
# Date 1272870514 14400
# Node ID 310c730516af5517058f2d6ac12ca0a175fd29e5
# Parent  858ee3c7649784825d6b3c3b66bf6ff1ff7fa6bf
added description of nist19 and captcha data sources

diff -r 858ee3c76497 -r 310c730516af writeup/techreport.tex
--- a/writeup/techreport.tex	Mon May 03 02:44:11 2010 -0400
+++ b/writeup/techreport.tex	Mon May 03 03:08:34 2010 -0400
@@ -248,12 +248,13 @@
 
 \begin{itemize}
 \item {\bf NIST}
-The NIST Special Database 19 (NIST19) \ref{Grother} is a very widely used dataset for training and testing OCR systems. The dataset is 
-composed with over 800 000 digits and characters (upper and lower cases), with hand checked classifications, extracted from
-handwritten sample forms of 3600 writers. The characters are labelled by one of the 62 classes corresponding to "0"-"9",
-"A"-"Z" and "a"-"z". The dataset contains 8 series of different complexity. The fourth series, $hsf_4$, 
-experimentally recognized to be the most difficult one for classification task is recommended by NIST as testing set and is
-used in our work for that purpose. The performances reported by previous work on that dataset mostly use only the digits.
+The NIST Special Database 19 (NIST19) is a widely used dataset for training and testing OCR systems.
+The dataset is composed of over 800 000 digits and characters (upper and lower case), with hand-checked classifications,
+extracted from handwritten sample forms of 3600 writers. Each character is labelled with one of 62 classes
+corresponding to ``0''--``9'', ``A''--``Z'' and ``a''--``z''. The dataset contains 8 series of different complexity.
+The fourth series, $hsf_4$, experimentally recognized as the most difficult one for the classification task, is recommended
+by NIST as the test set and is used in our work for that purpose.
+Most results reported in previous work on that dataset use only the digits.
 Here we use the whole classes both in the training and testing phase.