comparison doc/v2_planning/learner.txt @ 1044:3b1fd599bafd

my first draft of my own views, which are close to being just a reformulation of what James proposes
author Razvan Pascanu <r.pascanu@gmail.com>
date Wed, 08 Sep 2010 12:55:30 -0400
parents 3f528656855b
children d57bdd9a9980
itself. In other words, there is no API through which to attach information to nodes. It is
not good to say that the Learner instance *is* the node because (a) learner instances change
during graph exploration and (b) learner instances are big, and we don't want to have to keep a
whole saved model just to attach meta-info, e.g. validation score. Choosing this API spills
over into other committees, so we should get their feedback about how to resolve it.

Just another view/spin on the same idea (Razvan)
================================================


My idea is probably just a spin-off from what James wrote. It is an extension
of what I sent to the mailing list some time ago.

Big Picture
-----------

What do we care about?
~~~~~~~~~~~~~~~~~~~~~~

This is the list of the main points that I have in mind:

* Re-usability
* Extensibility
* Simplicity, or easily readable code (connected to re-usability)
* Modularity (connected to extensibility)
* Fast-to-write code (sort of follows from simplicity)
* Efficient code


Composition
~~~~~~~~~~~

To me this reads as code generated by composing pieces. Imagine this:
you start off with something primitive that I will call a "variable", which
is probably a very unsuitable name. You then compose those initial
"variables", or transform them, through several "functions". Each such
"function" hides some logic that you, as the user, don't care about.
You can have low-level or micro "functions" and high-level or macro
"functions", where a high-level function is just a certain compositional
pattern of low-level "functions". There are several classes of "functions"
and "variables" that are interchangeable. This is how modularity is
obtained: by swapping between functions from a certain class.

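As a purely illustrative sketch (the function names and the compose() helper
are hypothetical, not a proposed API), composing low-level "functions" into a
high-level one could look roughly like this::

    from functools import reduce

    # Hypothetical low-level "functions": each one is stateless and simply
    # maps input "variables" to output "variables".
    def standardize(x):
        mean = sum(x) / len(x)
        return [v - mean for v in x]

    def scale(x, factor=2.0):
        return [factor * v for v in x]

    def clip(x, low=-1.0, high=1.0):
        return [min(max(v, low), high) for v in x]

    def compose(*steps):
        """Build a high-level "function" as a fixed chain of low-level ones."""
        return lambda x: reduce(lambda acc, f: f(acc), steps, x)

    # A high-level "function" is just a compositional pattern of low-level ones.
    preprocess = compose(standardize, scale, clip)

    print(preprocess([0.0, 1.0, 2.0, 3.0]))   # -> [-1.0, -1.0, 1.0, 1.0]
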
Now when you want to research something, what you do is first select
the step you want to look into. If you are lucky you can rewrite this
step as a certain decomposition of low-level transformations (there can be
multiple such decompositions). If not, you have to implement such a
decomposition according to your needs. Pick the low-level transformations you want
to change and write new versions that implement your logic.

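Continuing that hypothetical sketch, "researching a step" would then mean
rewriting only the one low-level transformation you care about and reusing
the rest of the chain unchanged, e.g.::

    import math

    # Hypothetical replacement for the clip() step above; nothing else in the
    # chain needs to be touched.
    def soft_clip(x, low=-1.0, high=1.0):
        mid, half = (high + low) / 2.0, (high - low) / 2.0
        return [mid + half * math.tanh((v - mid) / half) for v in x]

    preprocess_v2 = compose(standardize, scale, soft_clip)
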
I think the code will be easy to read, because it is just applying a fixed
set of transformations, one after the other. Whoever writes the code can
decide how explicitly they want to write things by switching between high-level
and low-level functions.

I think code written this way is re-usable, because you can just take this chain
of transformations and replace the one you care about, without looking into
the rest.

You get this fractal property of the code. Zooming in, you always get just
a set of functions applied to a set of variables. In the beginning those might
not be there, and you would have to create new "low-level" decompositions,
maybe even new "variables" that pass data between those decompositions.

The thing with variables here is that I don't want these "functions" to have
state. All the information is passed along through these variables. This
way understanding the graph is easy, and debugging it is also easier (than having
all these hidden states).

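A minimal sketch of what "stateless functions, information in the variables"
could mean in practice (the Variable record and scale_var are made-up names,
only meant to illustrate the idea)::

    from dataclasses import dataclass, field

    @dataclass(frozen=True)
    class Variable:
        value: object
        meta: dict = field(default_factory=dict)

    def scale_var(x, factor):
        # The "function" keeps nothing of its own: it reads its inputs and
        # returns a fresh Variable, so the whole history lives in the data.
        return Variable(value=[factor * v for v in x.value],
                        meta={**x.meta, "scaled_by": factor})

    v0 = Variable([1.0, 2.0, 3.0], meta={"source": "toy data"})
    v1 = scale_var(v0, 0.5)          # v0 is untouched; v1 carries the result
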
Note that while doing so we might (and I strongly think we should) create
a (symbolic) DAG of operations (this is where it becomes what James was saying).
In such a DAG the "variables" will be the nodes and the functions will be the edges.
I think having a DAG is useful in many ways (all of these are things that one
might think about implementing in the far future; I'm not proposing to implement
them unless we want to use them - like the reconstruction); see the sketch after
this list:

* there exists the possibility of writing optimizations (Theano style)
* there exists the possibility to add global-view utility functions (like
  a reconstruction function for SdA - extremely low level here), or global-view
  diagnostic tools
* the possibility of creating a GUI (where you just create the graph by
  picking transforms and variables from a list) or working interactively
  and then generating code that will reproduce the graph
* you can view the graph at different granularity levels to understand
  things (global diagnostics)

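Here is the promised sketch of the DAG idea, with entirely hypothetical
classes and helpers (Var, apply, ancestors) just to make the "variables are
nodes, functions are edges" picture concrete::

    class Var:
        """A node of the DAG; the edge that produced it is recorded on it."""
        def __init__(self, name, made_by=None, inputs=()):
            self.name = name
            self.made_by = made_by      # name of the "function" (edge), or None
            self.inputs = list(inputs)  # parent Var nodes

    def apply(fn_name, *inputs):
        # Applying a "function" symbolically just grows the graph.
        return Var("%s(%s)" % (fn_name, ", ".join(v.name for v in inputs)),
                   made_by=fn_name, inputs=inputs)

    x = Var("x")
    h = apply("encode", x)
    r = apply("decode", h)
    cost = apply("reconstruction_error", x, r)

    def ancestors(var):
        """A global-view walk over the graph, usable for diagnostics,
        rewrites/optimizations, or rendering at different granularities."""
        seen, stack = [], [var]
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.append(v)
                stack.extend(v.inputs)
        return seen

    print([v.name for v in ancestors(cost)])
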
We should have a taxonomy of possible classes of functions and possible
classes of variables, but those should not be exclusive. We can work at a high
level for now, and decompose those high-level functions into lower-level ones when
we need to. We can introduce new classes of functions or intermediate
variables between those low-level functions.

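As a rough illustration of such a taxonomy (the class names below are
invented for the example, not a proposed set), one "class of functions"
could simply be an interface whose members are interchangeable::

    from abc import ABC, abstractmethod

    class CorruptionFn(ABC):
        """One hypothetical class of "functions"; any member can stand in
        for any other in a chain that expects a corruption step."""
        @abstractmethod
        def __call__(self, x): ...

    class ZeroEveryOther(CorruptionFn):
        def __call__(self, x):
            return [0.0 if i % 2 else v for i, v in enumerate(x)]

    class NoCorruption(CorruptionFn):
        def __call__(self, x):
            return list(x)
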
Similarities with James' idea
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As I said before, this is, I think, just another view on what James proposed.
The learner in his case is the module that traverses the graph of these
operations, which makes sense here as well.

The 'execute' command in his API is, in my case, just applying a function to
some variables.

I think the learner keeps track of the graph that is formed in both cases.

His view is a bit more general. I see the graph as fully created by the user,
and the learner just has to go from the start to the end. In his case the
traversal is conditioned on some policies. I think these ideas can be mixed or
unified. What I would need in my case to get this functionality is something
similar to the lazy linker for Theano.

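A small, self-contained sketch of that traversal idea (the graph encoding and
evaluate() below are invented for the example; it only mimics the lazy-linker
behaviour, it does not use Theano itself)::

    # Each entry maps an output "variable" to the "function" that produces it
    # and the names of its input "variables"; the user builds this graph.
    GRAPH = {
        "h":    (lambda x: [2.0 * v for v in x],                       ["x"]),
        "r":    (lambda h: [0.5 * v for v in h],                       ["h"]),
        "cost": (lambda x, r: sum((a - b) ** 2 for a, b in zip(x, r)), ["x", "r"]),
    }

    def evaluate(name, values, graph=GRAPH):
        """The "learner" side: walk the graph on demand, computing a value
        only when something downstream actually needs it."""
        if name not in values:
            fn, input_names = graph[name]
            values[name] = fn(*[evaluate(n, values, graph) for n in input_names])
        return values[name]

    print(evaluate("cost", {"x": [1.0, 2.0, 3.0]}))   # -> 0.0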