diff writeup/nips2010_submission.tex @ 524:07bc0ca8d246
added paragraph comparing "our" self-taught learning with "theirs"
author    Dumitru Erhan <dumitru.erhan@gmail.com>
date      Tue, 01 Jun 2010 14:06:43 -0700
parents   c778d20ab6f8
children  4354c3c8f49c 8fe77eac344f
--- a/writeup/nips2010_submission.tex	Tue Jun 01 16:06:32 2010 -0400
+++ b/writeup/nips2010_submission.tex	Tue Jun 01 14:06:43 2010 -0700
@@ -688,6 +688,16 @@
 it was very significant for the SDA (from +13\% to +27\% relative change).
 %\end{itemize}
 
+In the original self-taught learning framework~\citep{RainaR2007}, the
+out-of-sample examples were used as a source of unsupervised data, and
+experiments showed its positive effects in a \emph{limited labeled data}
+scenario. However, many of the results by \citet{RainaR2007} (who used a
+shallow, sparse coding approach) suggest that the relative gain of self-taught
+learning diminishes as the number of labeled examples increases (essentially,
+a ``diminishing returns'' scenario). We note that, for deep
+architectures, our experiments show that such a positive effect is obtained
+even in a scenario with a \emph{very large number of labeled examples}.
+
 Why would deep learners benefit more from the self-taught learning
 framework? The key idea is that the lower layers of the predictor compute a
 hierarchy of features that can be shared across tasks or across variants of the
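
For readers who want to map the added paragraph onto code, the sketch below illustrates the two-phase self-taught learning setup the paper builds on: greedy layer-wise unsupervised pretraining of denoising autoencoders on unlabeled (possibly out-of-distribution) examples, followed by supervised training of a classifier on top of the learned feature hierarchy. This is a minimal NumPy sketch, not the experiment code from this repository; the layer sizes, learning rates, corruption level, and squared-error reconstruction loss are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    class DenoisingAutoencoderLayer:
        """One layer: corrupt the input, encode, decode, reduce reconstruction error."""
        def __init__(self, n_in, n_hidden, corruption=0.3, lr=0.1):
            self.W = rng.normal(0.0, 0.01, (n_in, n_hidden))   # tied encoder/decoder weights
            self.b = np.zeros(n_hidden)
            self.b_prime = np.zeros(n_in)
            self.corruption, self.lr = corruption, lr

        def encode(self, x):
            return sigmoid(x @ self.W + self.b)

        def pretrain_step(self, x):
            mask = rng.random(x.shape) > self.corruption        # randomly zero out inputs
            corrupted = x * mask
            h = self.encode(corrupted)
            recon = sigmoid(h @ self.W.T + self.b_prime)
            # Gradients of the squared reconstruction error (illustrative choice of loss).
            d_recon = (recon - x) * recon * (1.0 - recon)
            d_h = (d_recon @ self.W) * h * (1.0 - h)
            n = len(x)
            self.W -= self.lr * (corrupted.T @ d_h + d_recon.T @ h) / n
            self.b -= self.lr * d_h.mean(axis=0)
            self.b_prime -= self.lr * d_recon.mean(axis=0)

    # Phase 1: unsupervised pretraining, one layer at a time, on unlabeled data
    # that may come from other tasks or from distorted variants of the inputs.
    unlabeled = rng.random((1000, 64))                          # stand-in for unlabeled images
    layers = [DenoisingAutoencoderLayer(64, 32), DenoisingAutoencoderLayer(32, 16)]
    x = unlabeled
    for layer in layers:
        for _ in range(20):
            layer.pretrain_step(x)
        x = layer.encode(x)                                     # hidden codes feed the next layer

    # Phase 2: supervised training of a softmax output on top of the pretrained
    # feature hierarchy, using the labeled set (small or large).
    labeled_x = rng.random((200, 64))                           # stand-in for labeled images
    labeled_y = rng.integers(0, 10, 200)                        # stand-in for digit labels
    feats = labeled_x
    for layer in layers:
        feats = layer.encode(feats)
    V = rng.normal(0.0, 0.01, (16, 10))
    for _ in range(100):
        logits = feats @ V
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        p[np.arange(len(labeled_y)), labeled_y] -= 1.0          # gradient of softmax cross-entropy
        V -= 0.1 * feats.T @ p / len(labeled_y)

To keep the sketch short, phase 2 only trains the output weights; in a full stacked-denoising-autoencoder setup the supervised fine-tuning would also backpropagate through the pretrained lower layers, which is where the shared feature hierarchy discussed in the following paragraph comes into play.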