pylearn: doc/v2_planning/use_cases.txt comparison

comparison doc/v2_planning/use_cases.txt @ 1106:21d25bed2ce9

use_cases: Comment about using predefined dataset dimensions

author	Olivier Delalleau <delallea@iro>
date	Mon, 13 Sep 2010 22:44:37 -0400
parents	b422cbaddc52
children	0e12ea6ba661


-:546bd0ccb0e4
+:21d25bed2ce9
 algorithms, etc.) can be swapped.
 - there are no APIs for things which are not passed as arguments (i.e. the logic
 of the whole program is not exposed via some uber-API).
+OD comments: I didn't have time to look closely at the details, but overall I
+like the general feel of it. At least I'd expect us to need something like
+that to be able to handle the multiple use cases we want to support. I must
+say, though, that I'm a bit worried it could quickly become scary for
+newcomers, with 'lambda functions' and 'virtual machines'.
+Anyway, one point I would like to comment on is the line that creates the
+linear classifier. I hope that, as much as possible, we can avoid the need to
+specify dataset dimensions / number of classes in algorithm constructors. I
+regularly had issues in PLearn because we had, for instance, to give the
+number of inputs when creating a neural network. I much prefer when this kind
+of thing can be figured out at runtime:
+- Any parameter you can get rid of is a significant gain in
+user-friendliness.
+- It's not always easy to know in advance e.g. the dimension of your input
+dataset. Imagine for instance this dataset is obtained in a first step
+by going through a PCA whose number of output dimensions is set so as to
+keep 90% of the variance.
+- It seems to me that it better fits the idea of a symbolic graph: my
+intuition (that may be very different from what you actually have in mind)
+is to see an experiment as a symbolic graph, which you instantiate when you
+provide the input data. One advantage of this point of view is that it makes
+it natural to re-use the same block components on various datasets /
+splits, something we often want to do.
 K-fold cross validation of a classifier
 ---------------------------------------
 splits = kfold_cross_validate(
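
Note on the OD comments in this changeset (not part of the diff itself): the
idea of figuring out dataset dimensions at runtime could look roughly like the
Python sketch below. It assumes NumPy, and the names pca_keep_variance and
LazyLinearClassifier are hypothetical illustrations, not pylearn or PLearn
APIs. The constructor takes no input dimension or class count; both are read
from the data in fit(), so the classifier can follow a PCA step whose output
size is only known once the data has been seen.

```python
import numpy as np


def pca_keep_variance(X, fraction=0.9):
    """Project X onto the fewest principal components that explain
    at least `fraction` of the variance (hypothetical helper).
    Assumes X is not constant, so the total variance is non-zero."""
    Xc = X - X.mean(axis=0)
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    # First index whose cumulative ratio reaches `fraction`, as a count.
    k = min(int(np.searchsorted(explained, fraction)) + 1, len(s))
    return Xc @ vt[:k].T


class LazyLinearClassifier:
    """Hypothetical softmax linear classifier. No dataset dimensions
    are passed to the constructor; they are inferred in fit()."""

    def __init__(self, learning_rate=0.1, n_epochs=200):
        self.learning_rate = learning_rate
        self.n_epochs = n_epochs
        self.W = None
        self.b = None

    def fit(self, X, y):
        # Shapes come from the data, not from the caller.
        n_inputs = X.shape[1]
        n_classes = int(y.max()) + 1
        self.W = np.zeros((n_inputs, n_classes))
        self.b = np.zeros(n_classes)
        onehot = np.eye(n_classes)[y]
        for _ in range(self.n_epochs):
            logits = X @ self.W + self.b
            logits -= logits.max(axis=1, keepdims=True)  # numerical stability
            p = np.exp(logits)
            p /= p.sum(axis=1, keepdims=True)
            grad = (p - onehot) / len(X)
            self.W -= self.learning_rate * (X.T @ grad)
            self.b -= self.learning_rate * grad.sum(axis=0)
        return self

    def predict(self, X):
        return np.argmax(X @ self.W + self.b, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 10))
    y = np.arange(60) % 3  # integer labels 0, 1, 2
    Z = pca_keep_variance(X, fraction=0.9)  # width unknown in advance
    clf = LazyLinearClassifier().fit(Z, y)
    print(Z.shape, clf.W.shape)
```

The same LazyLinearClassifier instance could be re-fit on a different dataset
or split with another input width, which is the re-use argument made in the
last bullet of the comment.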
