Carl Banks <[EMAIL PROTECTED]> wrote:
   ...
> How about this?  The decorator could generate a bytecode wrapper that
> would have the following behavior, where __setlocal__ and
> __execute_function__ are special forms that are not possible in
> Python.  (The loops would necessarily be unwrapped in the actual
> bytecode.)

I'm not entirely sure how you think those "special forms" would work.

Right now, say, if the compiler sees somewhere in your function

    z = 23
    print z

it thereby knows that z is a local name, so it adds a slot to the
function's locals-array, suppose it's the 11th slot, and generates
bytecode for "LOAD_FAST 11" and "STORE_FAST 11" to access and bind that
'z'.  (The string 'z' is stored in f.func_code.co_varnames but is not
used for the access or storing, just for debug/reporting purposes; the
access and storing are very fast because they need no lookup).

If instead it sees a "print z" with no assignment to name z anywhere in
the function's body, it generates instead bytecode "LOAD_GLOBAL 'z'"
(where the string 'z' is actually stored in f.func_code.co_names).  The
string (variable name) gets looked up in dict f.func_globals each and
every time that variable is accessed or bound/rebound.

If the compiler turns this key optimization off (because it sees an exec
statement anywhere in the function, currently), then the bytecode it
generates (for variables it can't be sure are local, but can't be sure
otherwise either as they MIGHT be assigned in that exec...) is different
again -- it's LOAD_NAME (which is like LOAD_GLOBAL in that it does need
to look up the variable name string, but often even slower because it
needs to look it up in the locals and then also in the globals if not
currently found among the locals -- so it may often have to pay for two
lookups, not just one).

So it would appear that to make __setlocal__ work, among other minor
revolutions to Python's code objects (many things that are currently
tuples, built once and for all by the compiler at def time, would have
to become lists so that __setlocal__ can change them on the fly), all
the LOAD_GLOBAL occurrences would have to become LOAD_NAME instead (so,
all references to globals would slow down, just as they're slowed down
today when the compiler sees an exec statement in the function body).

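The three cases above can be checked with the standard dis module.
What follows is only a minimal sketch, and it assumes a current
Python 3 rather than the Python 2 the message describes: print is a
function there, f.func_code is spelled f.__code__, and exact opcode
names vary somewhat between releases.  The function names are made up
for the illustration.

```python
import dis

def uses_local():
    z = 23          # the assignment makes z a local with its own slot
    print(z)

def uses_global():
    print(z)        # no assignment to z in the body: looked up by name

def opnames(code):
    """Return the set of opcode names used by a code object."""
    return {ins.opname for ins in dis.get_instructions(code)}

# Module-level code has no locals array, so its names go through
# LOAD_NAME (locals dict, then globals, then builtins).
module_code = compile("print(z)", "<example>", "exec")

print(uses_local.__code__.co_varnames)   # local slot names
print(uses_global.__code__.co_names)     # names looked up by string
print(sorted(opnames(uses_local.__code__)))
print(sorted(opnames(uses_global.__code__)))
print(sorted(opnames(module_code)))
```

Running dis.dis(uses_local) and dis.dis(uses_global) shows the same
information as a full listing, one instruction per line.
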
Incidentally, Python 3.0 is moving the OTHER way, giving up the chore of
dropping optimization to support 'exec' -- the latter will become a
function instead of a statement and the compiler will NOT go out of its
way to make it work "right" any more; if LOAD_NAME remains among Python
bytecodes (e.g. it may remain in use for class-statement bodies) it
won't be easy to ask the compiler to emit it instead of LOAD_GLOBAL (the
trick of just adding "exec 'pass'" will not work any more;-).

So, "rewriting" the bytecode on the fly (to use LOAD_NAME instead of
LOAD_GLOBAL, despite the performance hit) seems to be necessary; if
you're willing to take those two performance hits (at decoration time,
and again each time the function is called) I think you could develop
the necessary bytecode hacks even today.

> This wouldn't be that much slower than just assigning local variables
> to locals by hand, and it would allow assignments in the
> straightforward way as well.

The big performance hit comes from the compiler having no clue about
what you're doing (exactly the crucial hint that "assigning local
variables by hand" DOES give the compiler;-).

> There'd be some gotchas, so extra care is required, but it seems like
> for the OP's particular use case of a complex math calculation script,
> it would be a decent solution.

Making such complex calculations even slower doesn't look great to me.

> I understand where the OP is coming from.  I've done flight
> simulations in Java where there are lot of complex calculations using
> symbols.
> This is a typical formula (drag force calculation) that I
> would NOT want to have to use self.xxx for:
>
> FX_wind = -0.5 * rho * Vsq * Sref * (C_D_0 + C_D_alphasq*alpha*alpha +
>           C_D_esq*e*e)

If ALL the names in every formula always refer to nothing but instance
variables (no references to globals or builtins such as sin, pi, len,
abs, and so on, by barenames) then there might be better tricks, ones
that rely on that knowledge to actually make things *faster*, not
slower.  But they'd admittedly require a lot more work (basically a
separate specialized compiler to generate bytecode for these cases).


Alex
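
For comparison, "assigning local variables by hand" for the quoted drag
formula might look like the sketch below.  The Aero class and its
constructor are hypothetical, invented only to hold the quantities the
formula names; the point is that each attribute is fetched once and
the formula itself is then written with bare names, which the compiler
treats as fast locals.

```python
class Aero:
    """Hypothetical container for the quantities in the quoted formula."""

    def __init__(self, **values):
        # Store every keyword argument as an instance attribute.
        self.__dict__.update(values)

    def fx_wind(self):
        # Bind each attribute to a local once; the assignments tell the
        # compiler these names are locals, so later uses need no lookup.
        rho, Vsq, Sref = self.rho, self.Vsq, self.Sref
        C_D_0, C_D_alphasq, C_D_esq = self.C_D_0, self.C_D_alphasq, self.C_D_esq
        alpha, e = self.alpha, self.e
        return -0.5 * rho * Vsq * Sref * (C_D_0 + C_D_alphasq*alpha*alpha +
                                          C_D_esq*e*e)
```

The cost is one attribute lookup per name per call, paid up front,
instead of a change to how every name in the function is resolved.
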