comparison doc/v2_planning/main_plan.txt @ 1189:0e12ea6ba661

fix many rst syntax error warning.
author Frederic Bastien <nouiz@nouiz.org>
date Fri, 17 Sep 2010 20:55:18 -0400
parents 1ed0719cfbce
1188:073c2fab7bcd 1189:0e12ea6ba661
1 1
2 Motivation 2 Motivation
3 ========== 3 ==========
4 4
5 Yoshua (points discussed Thursday Sept 2, 2010 at LISA tea-talk) 5 Yoshua (points discussed Thursday Sept 2, 2010 at LISA tea-talk)
6 ------ 6 ----------------------------------------------------------------
7 7
8 ****** Why we need to get better organized in our code-writing ****** 8 ****** Why we need to get better organized in our code-writing ******
9 9
10 - current state of affairs on top of Theano is anarchic and does not lend itself to easy code re-use 10 - current state of affairs on top of Theano is anarchic and does not lend itself to easy code re-use
11 - the lab is growing and will continue to grow significantly, and more people outside the lab are using Theano 11 - the lab is growing and will continue to grow significantly, and more people outside the lab are using Theano
149 149
150 150
151 Another thing to consider related to datasets is that there are a number of 151 Another thing to consider related to datasets is that there are a number of
152 other efforts to have standard ML datasets, and we should be aware of them, 152 other efforts to have standard ML datasets, and we should be aware of them,
153 and compatible with them when it's easy: 153 and compatible with them when it's easy:
154
154 - mldata.org (they have a file format, not sure how many use it) 155 - mldata.org (they have a file format, not sure how many use it)
155 - weka (ARFF file format) 156 - weka (ARFF file format)
156 - scikits.learn 157 - scikits.learn
157 - hdf5 / pytables 158 - hdf5 / pytables
158 159
166 found in pylearn. 167 found in pylearn.
167 168
168 Yoshua (about ideas proposed by Pascal Vincent a while ago): 169 Yoshua (about ideas proposed by Pascal Vincent a while ago):
169 170
170 - we may want to distinguish between datasets and tasks: a task defines 171 - we may want to distinguish between datasets and tasks: a task defines
171 not just the data but also things like what is the input and what is the 172 not just the data but also things like what is the input and what is the
172 target (for supervised learning), and *importantly* a set of performance metrics 173 target (for supervised learning), and *importantly* a set of performance metrics
173 that make sense for this task (e.g. those used by papers solving a particular 174 that make sense for this task (e.g. those used by papers solving a particular
174 task, or reported for a particular benchmark) 175 task, or reported for a particular benchmark)
175 176
176 - we should discuss a few "standards" that datasets and tasks may comply with, such as 177 - we should discuss a few "standards" that datasets and tasks may comply with, such as
177 - "input" and "target" fields inside each example, for supervised or semi-supervised learning tasks 178 - "input" and "target" fields inside each example, for supervised or semi-supervised learning tasks
178 (with a convention for the semi-supervised case when only the input or only the target is observed) 179 (with a convention for the semi-supervised case when only the input or only the target is observed)
179 - "input" for unsupervised learning 180 - "input" for unsupervised learning
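
A minimal Python sketch of the dataset/task distinction and the "input"/"target" field convention discussed above. Nothing here is existing pylearn code: the names `Task`, `error_rate`, `labeled` and `evaluate` are hypothetical, and using `None` to mark an unobserved field in the semi-supervised case is only one possible convention, assumed here for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional

# Assumption: an example is a plain dict of named fields, and a field
# whose value is None is treated as "not observed" (semi-supervised case).
Example = Dict[str, Any]
Metric = Callable[[List[Any], List[Any]], float]


def error_rate(targets, predictions):
    """Fraction of predictions that differ from their target."""
    if not targets:
        raise ValueError("no labeled examples to evaluate on")
    wrong = sum(1 for t, p in zip(targets, predictions) if t != p)
    return wrong / len(targets)


@dataclass
class Task:
    """Hypothetical task: data plus field roles plus performance metrics."""
    examples: List[Example]
    input_field: str = "input"
    target_field: Optional[str] = "target"  # None for unsupervised tasks
    metrics: Dict[str, Metric] = field(default_factory=dict)

    def labeled(self):
        """Examples where both the input and the target are observed."""
        if self.target_field is None:
            return []
        return [ex for ex in self.examples
                if ex.get(self.input_field) is not None
                and ex.get(self.target_field) is not None]

    def evaluate(self, predict):
        """Apply every metric of the task to a predictor function."""
        labeled = self.labeled()
        targets = [ex[self.target_field] for ex in labeled]
        predictions = [predict(ex[self.input_field]) for ex in labeled]
        return {name: metric(targets, predictions)
                for name, metric in self.metrics.items()}


# Toy data: the third example has no observed target.
examples = [
    {"input": [0.0, 1.0], "target": 1},
    {"input": [1.0, 0.0], "target": 0},
    {"input": [1.0, 1.0], "target": None},
    {"input": [0.0, 0.0], "target": 0},
]
task = Task(examples, metrics={"error_rate": error_rate})
scores = task.evaluate(lambda x: 0)  # trivial predictor: always class 0
```

Keeping the metrics on the task rather than on the dataset means the same examples could back several tasks, each with its own field roles and its own way of being scored.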