view doc/v2_planning/arch_src/plugin_JB_comments_RP.txt @ 1288:a165f2666643
cifar10 - added support for "all" split
author | James Bergstra <bergstrj@iro.umontreal.ca> |
---|---|
date | Wed, 29 Sep 2010 18:35:40 -0400 |
parents | 699ed5f5f188 |
children |
I agree with Ian, maybe using caps is not the best idea. It reminds me of BASIC, which I used a long time ago :). It also makes the code look a bit scary.

JB replies: personally I think it makes the code look more AWESOME, but I could go either way. See my reply to Ian in plugin_JB_comments_IG.txt.

I like the approach, and I think it comes close to my earliest proposal and to what I am proposing for the layer committee (though we have not had a meeting yet). I would, though, write it in a more Theano-like way (Ian has an example of how that would look). I would also drop the CALL and FILT constructs and instead have a decorator (or something similar) that wraps a function to turn it into a CALL or FILT. I hope this is only syntactic sugar that makes things more natural (does it change anything in the actual implementation?).

What I want to reach is something that looks very much like Theano, except that now you are creating the graph of execution steps. Refactoring what you wrote, it would look like this:

    x = buffer_repeat(1000, dataset.next())
    train_pca = pca.analyze(x)
    train_pca.run()

If you allow a FILT to also take multiple inputs (not just one), which comes naturally in this way of writing, you get to describe a DAG that captures not only the order of execution but also which step takes data from which. I'm sorry for not being there yesterday; from what I remember, I have the feeling that in your proposal this is done under the hood and not handled by these flow-control structures.

To be a bit more explicit, in the code above you can see that:

 a) dataset.next() has to run before pca.analyze
 b) pca.analyze needs the result (data) object of buffer_repeat(dataset.next())

I have elaborated on this idea here and there, and figured out what the result of such a control-flow construct is and how to make everything explicit in the graph. Parts of this are in my plugin_RP.py (Step 1), though it is a bit of a moving target.
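The decorator idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `step` and `Node` are hypothetical, the three decorated functions are toy stand-ins for the dataset and PCA steps, and none of it is taken from plugin_RP.py. Calling a decorated function builds a graph node instead of executing; `run()` then evaluates a node's inputs before the node itself.

```python
import functools


class Node:
    """A deferred function application: one vertex in the execution DAG."""

    def __init__(self, fn, args):
        self.fn = fn
        self.args = args  # may contain other Node objects (the DAG edges)

    def run(self, cache=None, trace=None):
        """Evaluate inputs first, then this node; each node runs at most once."""
        cache = {} if cache is None else cache
        if id(self) not in cache:
            values = [a.run(cache, trace) if isinstance(a, Node) else a
                      for a in self.args]
            if trace is not None:
                trace.append(self.fn.__name__)  # record execution order
            cache[id(self)] = self.fn(*values)
        return cache[id(self)]


def step(fn):
    """Hypothetical decorator: calling the wrapped function builds a Node."""
    @functools.wraps(fn)
    def build(*args):
        return Node(fn, args)
    return build


@step
def dataset_next():
    return [1.0, 2.0, 3.0]  # toy batch


@step
def buffer_repeat(n, batch):
    return batch * n  # toy stand-in: repeat the batch n times


@step
def analyze(x):
    return sum(x) / len(x)  # toy stand-in for fitting a PCA


x = buffer_repeat(2, dataset_next())
train = analyze(x)  # nothing has executed yet, only the graph exists
order = []
result = train.run(trace=order)
```

In this sketch `order` records that `dataset_next` ran before `buffer_repeat`, which ran before `analyze`, so both the execution order and the data dependencies fall out of how the expression was written.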
I also have a slightly different way of writing REPEAT and BUFFER_REPEAT, though I think it is mostly the same. I actually did not know how to deal with distributed things until I saw how you handle them in your code. Here is a copy-pasted version of an SDAA written my way:

    ## Layer 1:

    data_x, data_y = GPU_transform(load_mnist())
    noisy_data_x = gaussian_noise(data_x, amount = 0.1)
    hidden1 = tanh(dotW_b(data_x, n_units = 200))
    reconstruct1 = reconstruct(hidden1.replace(data_x, noisy_data_x), noisy_data_x)
    err1 = cross_entropy(reconstruct1, data_x)
    learner1 = SGD(err1)

    # Layer 2:

    noisy_hidden1 = gaussian_noise(hidden1, amount = 0.1)
    hidden2 = tanh(dotW_b(hidden1, n_units = 200))
    reconstruct2 = reconstruct(hidden2.replace(hidden1, noisy_hidden1), noisy_hidden1)
    err2 = cross_entropy(reconstruct2, hidden1)
    learner2 = SGD(err2)

    # Top layer:

    output = sigmoid(dotW_b(hidden2, n_units = 10))
    err = cross_entropy(output, data_y)
    learner = SGD(err)

GPU_transform, gaussian_noise and so on are functions that have been decorated (or classes, if you want), which you would write using FILT. Reconstruct, for me, is a different CONTROL FLOW element.

In this case I don't use REPEAT or BUFFER_REPEAT or the other very cool control flow elements, but you can easily imagine writing something like

    pretrained_in_parallel = weave(learner1, learner2)
    results = spawn(repeat(5000, learner1), repeat(500, learner2))

JB replies: This reply makes it clearer to me that I was not sensitive enough to the difference between *expressions* and *control-flow statements*.
What you have above is a graph of declarative expressions (if I understand correctly) with certain properties:
 - they have no side effects
 - they can be re-ordered within dependency constraints

Contrast this with the CALL statements in my proposal:
 - they work primarily by side effect
 - they cannot be re-ordered at all

So the fact that CALL currently works by side effect means that there is almost no graph manipulation that can be guaranteed not to change the program. This is a reason to make CALL statements *encapsulate* programs constructed using declarative constructs (i.e. Theano functions).

In other words, in the short term, this feels to me like the reason *not* to mix Theano graph building with this control-flow business.

Consequently, I think I will remove the BUFFER_REPEAT construct, since it is really an expression masquerading as a control-flow statement, and I will remove FILT too.

RP asks: I understand now the difference between what you wrote and what I had in mind, though I don't understand the argument against it. Do you mean that writing it the way I proposed implies a much more complicated backbone framework, which would take us too long to develop? Or did you mean something else?

JB replies: I don't think it's necessary to combine Theano with this control-flow proposal, and I don't know how to do it. Yes, it seems like it would be hard and/or awkward, and I don't really see the advantage of even trying.

RP: I think you misunderstood me. I did not propose to mix Theano with the library; I agree that would be awkward. What I had in mind (which might just be something different from what you are doing) is to use some concepts from how Theano deals with things. For example, right now you added registers.
Writing something like:

    CALL(fn, arg1, arg2, arg3, _set=reg('x'))

actually means

    reg('x') = fn(arg1, arg2, arg3)

You get most of what you want because these control flow elements don't actually get executed until you run the program. That means that you have a fixed, simple graph (you can't play around with it) that tells how to execute your commands. You can save that graph, and the point in the graph where you stopped, so that you can resume later. You can also save all registers at that point.

Why not have fn, instead of being a Python function, be some class that implements a run method which does what your CALL would do? The __init__ or __call__ of that class would do what CALL does in your case. Do you think that would be impossible to implement? Any such function could either return a set of registers or not. Your other control-flow constructs would just be special functions of this kind. The only thing that might look a bit strange is the sequence, in case you need to return things. Maybe I could use the same trick, namely a _set argument to __call__.

I'm not against your approach; I just think it can be written a bit differently, in a way that in my opinion is easier to read and understand. I will have nothing against it if we decide to write it exactly as you propose, and I'm sure I will get the hang of it pretty fast.

Bottom line (in my view):

 - I don't say we should mix Theano with anything
 - I think writing things so that it looks like applying functions to objects is more natural and easier for noobs to understand
 - Writing a new function by inheriting from a class and implementing a method is also natural
 - I do not propose to do optimizations or play with the graph!

I do, though, think that you should be able to:
 * replace parts of a subgraph with a different one
 * automatically collect hyper-parameters or parameters if you ever want to
 * change the value of these somehow
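The class-based variant described above might look roughly like the sketch below. Everything here is an assumption made for illustration: `Program`, `Step`, `Reg` and `Add` are hypothetical names, registers are a plain dict keyed by string rather than the `reg('x')` objects of the proposal, and the "graph" is just a linear list of steps. `__call__` only schedules work; the subclass's `run` method does the work when the program is executed, and the program can stop partway and resume.

```python
class Program:
    """Holds named registers and an ordered list of deferred steps."""

    def __init__(self):
        self.registers = {}
        self.steps = []
        self.position = 0  # index of the next step to execute

    def run(self, max_steps=None):
        """Execute pending steps; stop early after max_steps if given."""
        done = 0
        while self.position < len(self.steps):
            if max_steps is not None and done >= max_steps:
                break
            self.steps[self.position]()
            self.position += 1
            done += 1


class Reg:
    """Hypothetical placeholder for a register that is read at run time."""

    def __init__(self, name):
        self.name = name


class Step:
    """Base class: subclasses implement run(); __call__ only schedules it."""

    def __init__(self, program):
        self.program = program

    def __call__(self, *args, _set=None):
        def execute():
            regs = self.program.registers
            # Resolve register placeholders to their current values.
            values = [regs[a.name] if isinstance(a, Reg) else a for a in args]
            out = self.run(*values)
            if _set is not None:
                regs[_set] = out  # plays the role of reg(_set) = fn(...)
        self.program.steps.append(execute)

    def run(self, *args):
        raise NotImplementedError


class Add(Step):
    """A new 'function' written by inheriting and implementing run()."""

    def run(self, a, b):
        return a + b


prog = Program()
add = Add(prog)
add(1, 2, _set='x')          # deferred: x = 1 + 2
add(Reg('x'), 10, _set='y')  # deferred: y = x + 10
prog.run(max_steps=1)        # execute only the first step
saved = (prog.position, dict(prog.registers))  # state one could persist
prog.run()                   # resume from the saved position
```

The tuple `saved` stands in for what would be written to disk to resume later: the position in the step list and a copy of the registers at that point.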