Mercurial > ift6266
comparison writeup/nips2010_submission.tex @ 480:150203d2b5c3
added number of train test and valid for NIST
author | Xavier Glorot <glorotxa@iro.umontreal.ca> |
---|---|
date | Sun, 30 May 2010 19:05:22 -0400 |
parents | 6593e67381a3 |
children | ce69aa9204d8 |
comparison
479:6593e67381a3 | 480:150203d2b5c3 |
---|---|
313 extracted from handwritten sample forms of 3600 writers. The characters are labelled by one of the 62 classes | 313 extracted from handwritten sample forms of 3600 writers. The characters are labelled by one of the 62 classes |
314 corresponding to "0"-"9","A"-"Z" and "a"-"z". The dataset contains 8 series of different complexity. | 314 corresponding to "0"-"9","A"-"Z" and "a"-"z". The dataset contains 8 series of different complexity. |
315 The fourth series, $hsf_4$, experimentally recognized to be the most difficult one is recommended | 315 The fourth series, $hsf_4$, experimentally recognized to be the most difficult one is recommended |
316 by NIST as testing set and is used in our work and some previous work~\cite{Granger+al-2007,Cortes+al-2000,Oliveira+al-2002,Milgram+al-2005} | 316 by NIST as testing set and is used in our work and some previous work~\cite{Granger+al-2007,Cortes+al-2000,Oliveira+al-2002,Milgram+al-2005} |
317 for that purpose. We randomly split the remainder into a training set and a validation set for | 317 for that purpose. We randomly split the remainder into a training set and a validation set for |
318 model selection. The sizes of these data sets are: for training, XXX for validation, | 318 model selection. The sizes of these data sets are: 651668 for training, 80000 for validation, |
319 and XXX for testing. | 319 and 82587 for testing. |
320 The performances reported by previous work on that dataset mostly use only the digits. | 320 The performances reported by previous work on that dataset mostly use only the digits. |
321 Here we use all the classes both in the training and testing phase. This is especially | 321 Here we use all the classes both in the training and testing phase. This is especially |
322 useful to estimate the effect of a multi-task setting. | 322 useful to estimate the effect of a multi-task setting. |
323 Note that the distribution of the classes in the NIST training and test sets differs | 323 Note that the distribution of the classes in the NIST training and test sets differs |
324 substantially, with relatively many more digits in the test set, and uniform distribution | 324 substantially, with relatively many more digits in the test set, and uniform distribution |