view doc/v2_planning/arch_src/plugin_JB_comments_IG.txt @ 1419:cff305ad9f60
TensorFnDataset - added x_ attribute that caches the dataset function return
value, but does not get pickled.
author | James Bergstra <bergstrj@iro.umontreal.ca> |
---|---|
date | Fri, 04 Feb 2011 16:05:22 -0500 |
parents | 16919775479c |
-Does everything have to be all caps? I know I will get annoyed with that.

JB replies: I chose caps because
a) I wanted to be able to use statements like IF, WHILE, etc. that are reserved words in Python in lower case... but this turned out not to be a large overlap,
b) I wanted to make up for the lack of syntax highlighting of control flow statements in VIM by making the words bigger,
c) I thought it looked kinda retro-cool.
None of these reasons is really strong; if you or others have strong feelings against caps then no problem.

-Regarding overall program structure:

Do you think there might be an easier to read/type way of specifying programs than building them out of constructors? This seems like it's going to lead to an unwieldy proliferation of parentheses, like in LISP, but since it's an imperative language it's more likely that we'll have lots of different scopes visible at the same time, and it will be hard to tell which section is nested inside which other section if they're all just a bunch of constructor calls fed to each other. Right now it seems to take only a few layers of SWITCH and SEQ to end up with an unreadable mess.

I know I'm not getting the syntax exactly matched to your proposal, but just to illustrate what I'm saying, we could have a program that looks like this:

    program = SEQ(
        A,
        B,
        SWITCH(var1,
            val_1_1, C,
            val_1_2, SWITCH(var2,
                val_2_1, D,
                val_2_2, E)
        ),
        F,
        SWITCH(var3,
            val_3_1, G,
            val_3_2, H)
    )

This seems like it could quickly turn into a nightmare, trying to count parentheses everywhere. An alternative to make it more parseable is:

    switch1 = SWITCH(var2, val_2_1, D, val_2_2, E)
    switch2 = SWITCH(var1, val_1_1, C, val_1_2, switch1)
    switch3 = SWITCH(var3, val_3_1, G, val_3_2, H)
    program = SEQ(A, B, switch2, F, switch3)

This is a lot more manageable, but now the parts are out of order, so the cognitive load required to debug and understand it doesn't scale well with program size.
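
To make the comparison above concrete, here is a minimal Python sketch showing that the nested form and the named-subprogram form build the same tree, plus a small helper that prints the tree with indentation for debugging. The `SeqNode`, `SwitchNode` and `dump` names are hypothetical stand-ins invented for this illustration, and plain strings take the place of the placeholder statements and values; none of this is the actual proposal's code.

```python
from collections import namedtuple

# Hypothetical stand-ins for the proposed constructors (illustration only).
SeqNode = namedtuple("SeqNode", ["elements"])
SwitchNode = namedtuple("SwitchNode", ["var", "branches"])


def SEQ(*elements):
    return SeqNode(tuple(elements))


def SWITCH(var, *pairs):
    # pairs alternate value, element: val_1, elem_1, val_2, elem_2, ...
    if len(pairs) % 2:
        raise ValueError("SWITCH needs value/element pairs")
    return SwitchNode(var, tuple(zip(pairs[0::2], pairs[1::2])))


# Placeholder statements are plain strings in this sketch.
A, B, C, D, E, F, G, H = "ABCDEFGH"

# Form 1: everything nested in one expression.
nested = SEQ(
    A, B,
    SWITCH("var1", "val_1_1", C,
           "val_1_2", SWITCH("var2", "val_2_1", D, "val_2_2", E)),
    F,
    SWITCH("var3", "val_3_1", G, "val_3_2", H),
)

# Form 2: sub-programs named first, then assembled.
switch1 = SWITCH("var2", "val_2_1", D, "val_2_2", E)
switch2 = SWITCH("var1", "val_1_1", C, "val_1_2", switch1)
switch3 = SWITCH("var3", "val_3_1", G, "val_3_2", H)
named = SEQ(A, B, switch2, F, switch3)


def dump(node, depth=0):
    """Return the tree as a list of indented text lines."""
    pad = "    " * depth
    if isinstance(node, SeqNode):
        lines = []
        for element in node.elements:
            lines.extend(dump(element, depth))
        return lines
    if isinstance(node, SwitchNode):
        lines = [pad + "SWITCH " + node.var]
        for value, element in node.branches:
            lines.append(pad + "    " + value + ":")
            lines.extend(dump(element, depth + 2))
        return lines
    return [pad + str(node)]


if __name__ == "__main__":
    print("\n".join(dump(nested)))
```

Because both forms produce equal trees, a helper like `dump` could recover the in-order, indented view regardless of how the program was typed in.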

It would be much nicer if, since it is a programming language, we could write:

    A
    B
    SWITCH var1
        val_1_1, C
        val_1_2, SWITCH var2
            val_2_1, D
            val_2_2, E
    F
    SWITCH var3
        val_3_1, G
        val_3_2, H

I can see a few different ways of accomplishing this, but of course welcome more proposals:

1) Make a scripting language, so we pass a file into our library. We could base it on XML, maybe, if we didn't want to spend too much time making our own parser:

    switch.xml contains:

    <PyLearn>
        <A />
        <B />
        <Switch var="var_1">
            <Branch val="val_1_1">
                <C />
            </Branch>
            ...
        </Switch>
    </PyLearn>

    python pylearn.py switch.xml

2) We could make a global program compiler, or have program objects that have an idea of the current scope, that you just add things to:

    p = pylearn.program()
    p(A)
    p(B)
    p(SWITCH(var1))
    p(    Branch(val_1_1, C))  # one annoying thing is python wouldn't let us indent things as we please
    ...
    p(END_SWITCH)
    ...
    p.compile()
    p.run()

If we design our language to be LL(1) (fairly easy to do) then it's pretty easy to make p check that the calls to it are syntactically correct as they happen.

JB replies: What I've proposed so far is a few classes for adding new program-flow constructs to Python, which is I think less ambitious and more desirable than defining a separate language. For example, the bodies of the CALL objects are all python methods (not implemented in my - I hesitate to call it a - "language") and the program itself is constructed using *python* control flow and *python* methods. I don't want to have another set of syntax rules, or have to create a macro system, or a pre-processor.

I agree it would be nicer to have a more elegant syntax, but I'd much rather live with a few extra parentheses than require someone to go to the trouble of implementing that luxury. (And we can tweak the control-flow constructors to minimize the number of brackets & parentheses too.) Besides, we can always implement that language & compiler later. For now we can just type the extra brackets.
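
As a rough illustration of option 2 above (a program object that tracks the current scope and checks each call as it happens), here is a minimal Python sketch. The `Program`, `Branch`, `END_SWITCH` and `ProgramSyntaxError` names are assumptions made up for this example, the placeholder statements are plain strings, and only SWITCH nesting is handled; it is not meant as the actual pylearn interface.

```python
class ProgramSyntaxError(Exception):
    """Raised as soon as a call to the builder breaks the nesting rules."""


class SWITCH:
    def __init__(self, var):
        self.var = var
        self.branches = []  # filled in by later Branch calls


class Branch:
    def __init__(self, value, element):
        self.value = value
        self.element = element


END_SWITCH = object()  # sentinel that closes the innermost open SWITCH


class Program:
    """Hypothetical builder: statements are added one call at a time."""

    def __init__(self):
        self.body = []   # top-level statements, in the order they were added
        self._open = []  # stack of SWITCH scopes that are not closed yet

    def __call__(self, item):
        if item is END_SWITCH:
            if not self._open:
                raise ProgramSyntaxError("END_SWITCH without an open SWITCH")
            self._open.pop()
        elif isinstance(item, Branch):
            if not self._open:
                raise ProgramSyntaxError("Branch outside of a SWITCH")
            self._open[-1].branches.append(item)
            if isinstance(item.element, SWITCH):
                # A nested SWITCH opens a new scope for the next Branch calls.
                self._open.append(item.element)
        elif self._open:
            raise ProgramSyntaxError("expected Branch or END_SWITCH inside SWITCH")
        else:
            self.body.append(item)
            if isinstance(item, SWITCH):
                self._open.append(item)

    def compile(self):
        if self._open:
            raise ProgramSyntaxError("unclosed SWITCH at end of program")
        return list(self.body)


p = Program()
p("A")
p("B")
p(SWITCH("var1"))
p(Branch("val_1_1", "C"))
p(Branch("val_1_2", SWITCH("var2")))
p(Branch("val_2_1", "D"))
p(Branch("val_2_2", "E"))
p(END_SWITCH)  # closes the SWITCH on var2
p(END_SWITCH)  # closes the SWITCH on var1
p("F")
compiled = p.compile()
```

Each call needs only the item being added and the stack of open scopes to decide whether it is legal, which is the kind of check-as-you-go behaviour the LL(1) remark points at. The sketch also shows the cost JB mentions: even this small checker is extra machinery on top of the constructors.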

Perhaps I don't understand your first example - where do the definitions of A and B come from? Must they be in the same file higher up or something? In what language will they be defined?

IG replies: Doesn't your proposal refer to what is being created as an "imperative language"? A, B, etc. are just placeholders for whatever kind of statement you want to fill in-- CALL, FILTER, etc. What I wrote wasn't meant to be a real program, just an example of how tree-structured programs get mapped into text.

The main reason to bring up the issue of a scripting language for assembling these constructors is that we need to make sure that the set of optional arguments to each constructor is such that the scripting language built on top of them is LL(1). Fortunately, that is not very hard. When we start converging on the final interface I can do the check myself.

JB replies: I don't know if this is standard - but I think of a language as being ... maybe... "the set of all syntactic elements with which you make a program", and by that criterion I am not proposing a complete language. I attempt a definition because these control-flow programs are expressed in ELEMENTs that eventually bottom out at CALL(X, ...) where X is *not* defined in the control-flow language. X is a Python function that typically (and maybe necessarily - I'm not sure) knows nothing about the control-flow language that is calling it. So with this view in mind, I can't understand why or how it would make sense to define the control flow in a new language, or in XML or something. After all, how are you going to tell it what to CALL?
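
JB's point that control-flow ELEMENTs bottom out at CALL(X, ...), with X an ordinary Python function, can be illustrated with a minimal sketch. The `run(env)` method, the `env` dictionary used to look up SWITCH variables, and the `record` function are all assumptions invented for this example; the real classes in the proposal may look quite different.

```python
class CALL:
    """Leaf element: wraps a plain Python callable and its arguments."""

    def __init__(self, fn, *args):
        self.fn = fn
        self.args = args

    def run(self, env):
        # The wrapped function never sees env or any control-flow object.
        self.fn(*self.args)


class SEQ:
    """Run each element in order."""

    def __init__(self, *elements):
        self.elements = elements

    def run(self, env):
        for element in self.elements:
            element.run(env)


class SWITCH:
    """Run the element whose value matches env[var] (values must be hashable)."""

    def __init__(self, var, *pairs):
        self.var = var
        self.branches = dict(zip(pairs[0::2], pairs[1::2]))

    def run(self, env):
        self.branches[env[self.var]].run(env)


log = []


def record(message):
    # An ordinary Python function that knows nothing about SEQ, SWITCH or CALL.
    log.append(message)


program = SEQ(
    CALL(record, "load data"),
    SWITCH("mode",
           "train", CALL(record, "fit model"),
           "test", CALL(record, "evaluate")),
    CALL(record, "done"),
)

program.run({"mode": "train"})
```

Here `CALL` receives `record` as a Python object directly. An XML or other external front end would need some additional convention (for example an importable name) to refer to such functions, which is the difficulty the question "how are you going to tell it what to CALL?" raises.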