[issue11549] Build-out an AST optimizer, moving some functionality out of the peephole optimizer
Eugene Toder added the comment: Nick, if there's an interest in reviewing the patch I can update the it. I doubt it needs a lot of changes, given that visitor is auto-generated. Raymond, the patch contains a rewrite of low-level optimizations to work before byte code generation, which simplifies them a great deal. -- ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12290] __setstate__ is called for false values
New submission from Eugene Toder : Pickle documentation [1] says: """ Note: If __getstate__() returns a false value, the __setstate__() method will not be called upon unpickling. """ However, this isn't quite true. This depends on the version of pickle protocol. A small example: >>> class Pockle(object): def __getstate__(self): return 0 def __setstate__(self, state): sys.stdout.write('__setstate__ is called!\n') >>> for p in range(4): sys.stdout.write('protocol %d: ' % p) pickle.loads(pickle.dumps(Pockle(), p)) protocol 0: <__main__.Pockle object at 0x02EAE3C8> protocol 1: <__main__.Pockle object at 0x02EAE358> protocol 2: __setstate__ is called! <__main__.Pockle object at 0x02EAE3C8> protocol 3: __setstate__ is called! <__main__.Pockle object at 0x02EAE358> So for protocols >= 2 setstate is called. This is caused by object.__reduce_ex__ returning different tuples for different protocol versions: >>> for p in range(4): sys.stdout.write('protocol %d: %s\n' % (p, Pockle().__reduce_ex__(p))) protocol 0: (, (, , None)) protocol 1: (, (, , None)) protocol 2: (, (,), 0, None, None) protocol 3: (, (,), 0, None, None) Implementation of reduce_ex for protos 0-1 in copy_reg.py contains the documented check: http://hg.python.org/cpython/file/f1509fc75435/Lib/copy_reg.py#l85 Implementation for proto 2+ in typeobject.c is happy with any value: http://hg.python.org/cpython/file/f1509fc75435/Objects/typeobject.c#l3205 Pickle itself only ignores None, not any false value: http://hg.python.org/cpython/file/f1509fc75435/Lib/pickle.py#l418 I think this is a documentation issue at this point. [1] http://docs.python.org/py3k/library/pickle.html#pickle.object.__setstate__ -- components: Library (Lib) messages: 137935 nosy: eltoder priority: normal severity: normal status: open title: __setstate__ is called for false values versions: Python 2.6, Python 2.7, Python 3.1, Python 3.2, Python 3.3 ___ Python tracker <http://bugs.python.org/issue12290> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12290] __setstate__ is called for false values
Changes by Eugene Toder : -- nosy: +alexandre.vassalotti, pitrou ___ Python tracker <http://bugs.python.org/issue12290> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12290] __setstate__ is called for false values
Eugene Toder added the comment: So how about this correction? -- keywords: +patch nosy: +belopolsky, georg.brandl Added file: http://bugs.python.org/file22375/setstate.diff ___ Python tracker <http://bugs.python.org/issue12290> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5996] abstract class instantiable when subclassing dict
Eugene Toder added the comment: Anyone has any thoughts on this? -- ___ Python tracker <http://bugs.python.org/issue5996> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Build-out an AST optimizer, moving some functionality out of the peephole optimizer
Eugene Toder added the comment: I found a problem in constant de-duplication, already performed by compiler, that needs to be fixed before this change can be merged. Compiler tries to eliminate duplicate constants by putting them into a dict. However, "duplicate" in this case doesn't mean just "equal", we need a stronger relationship, as there are many equal values that behave differently in some contexts, e.g. 0 and 0.0 and False or 0.0 and -0.0. To this end for each value we create a tuple of the value and it's type and have some logic for -0.0. This is handled in compiler_add_o in Python/compile.c. This logic, however, only works for scalar values -- if we get a container with 0 and the same container with False we collapse them into one. This was not a problem before, because constant tuples were only created by peephole, which doesn't attempt de-duplication. If tuple folding is moved to AST we start hitting this problem: >>> dis(lambda: print((0,1),(False,True))) 1 0 LOAD_GLOBAL 0 (print) 3 LOAD_CONST 1 ((0, 1)) 6 LOAD_CONST 1 ((0, 1)) 9 CALL_FUNCTION2 12 RETURN_VALUE The cleanest solution seems to be to introduce a new rich comparison code: Py_EQUIV (equivalent) and implement it at least in types that we support in marshal. This will simplify compiler_add_o quite a bit and make it work for tuples and frozensets. I'm open to other suggestions. -- ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5996] abstract class instantiable when subclassing dict
Eugene Toder added the comment: They are, when there's a most specific metaclass -- the one which is a subclass of all others (described here http://www.python.org/download/releases/2.2.3/descrintro/#metaclasses, implemented here http://hg.python.org/cpython/file/ab162f925761/Objects/typeobject.c#l1956). Since ABCMeta is a subclass of type this holds. Also, in the original example there's no multiple inheritance at all. -- ___ Python tracker <http://bugs.python.org/issue5996> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11816] Refactor the dis module to provide better building blocks for bytecode analysis
Changes by Eugene Toder : -- nosy: -eltoder ___ Python tracker <http://bugs.python.org/issue11816> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11244] Negative tuple elements produce inefficient code.
Eugene Toder added the comment: As discussed on the list, peephole refuses to fold -0. The reasons for this are unclear. Folding was disabled with this commit: http://hg.python.org/cpython/diff/660419bdb4ae/Python/compile.c Here's a trivial patch to enable the folding again, along with a test case. make test passes with the patch. The change is independent from Antoine's patches. -- nosy: +eltoder Added file: http://bugs.python.org/file21073/fold-0.patch ___ Python tracker <http://bugs.python.org/issue11244> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11462] Peephole creates duplicate and unused constants
New submission from Eugene Toder : Peephole optimizer performs constant folding, however 1) When it replaces operation with LOAD_CONST it always adds a new constant to co_consts, even if constant with the same value is already there. It also can add the same constant multiple times. 2) It does not remove constants that are no longer used after the operation was folded. The result is that code object after folding has more constants that it needs and so uses more memory. Attached are patches to address this. Patch for 1) comes in 2 versions. PlanA is simple (it only needs changes in peephole.c), but does linear searches through co_consts and duplicates some logic from compiler.c. PlanB needs changes in both peephole.c and compiler.c, but is free from such problems. I favour PlanB. Patch for 2) can be applied on top of either A or B. -- components: Interpreter Core messages: 130537 nosy: eltoder priority: normal severity: normal status: open title: Peephole creates duplicate and unused constants type: performance versions: Python 3.3 ___ Python tracker <http://bugs.python.org/issue11462> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11462] Peephole creates duplicate and unused constants
Changes by Eugene Toder : -- keywords: +patch Added file: http://bugs.python.org/file21074/dedup_const_plana.patch ___ Python tracker <http://bugs.python.org/issue11462> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11462] Peephole creates duplicate and unused constants
Eugene Toder added the comment: (either plana or planb should be applied) -- Added file: http://bugs.python.org/file21075/dedup_const_planb.patch ___ Python tracker <http://bugs.python.org/issue11462> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11462] Peephole creates duplicate and unused constants
Changes by Eugene Toder : Added file: http://bugs.python.org/file21076/unused_consts.patch ___ Python tracker <http://bugs.python.org/issue11462> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11462] Peephole creates duplicate and unused constants
Eugene Toder added the comment: (test case) -- nosy: +pitrou Added file: http://bugs.python.org/file21077/consts_test.patch ___ Python tracker <http://bugs.python.org/issue11462> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11462] Peephole creates duplicate and unused constants
Eugene Toder added the comment: I think the changes are fairly trivial. dedup_const_planb.patch is about 10 lines of new code with all the rest being trivial plubming. unused_consts.patch may look big, but only because I factored out fix ups into a separate function; there are only about 25 lines of new code. -- ___ Python tracker <http://bugs.python.org/issue11462> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11462] Peephole creates duplicate and unused constants
Eugene Toder added the comment: Antoine, sure, I'll fix it with any other suggested changes if patches will be accepted. -- ___ Python tracker <http://bugs.python.org/issue11462> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11244] Negative tuple elements produce inefficient code.
Eugene Toder added the comment: Mark, looks better now? -- Added file: http://bugs.python.org/file21082/fold-0.patch ___ Python tracker <http://bugs.python.org/issue11244> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11244] Negative tuple elements produce inefficient code.
Changes by Eugene Toder : Removed file: http://bugs.python.org/file21082/fold-0.patch ___ Python tracker <http://bugs.python.org/issue11244> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11244] Negative tuple elements produce inefficient code.
Eugene Toder added the comment: (forgot parens around 0) -- Added file: http://bugs.python.org/file21083/fold-0.2.patch ___ Python tracker <http://bugs.python.org/issue11244> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11462] Peephole creates duplicate and unused constants
Eugene Toder added the comment: > - why the "#ifndef NDEBUG" path? These entries in the table should not be used, but if something slips through and uses one of them, it's much easier to tell if we remap to invalid value. As this is an internal check, I didn't want it in release mode. If this is deemed unnecessary or confusing I can remove it. -- ___ Python tracker <http://bugs.python.org/issue11462> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11462] Peephole creates duplicate and unused constants
Eugene Toder added the comment: assert() doesn't quite work here. I need to check that this entry in the table is not used in the next loop. I'd need to put assert in that loop, but by that time I have no easy way to tell which entries are bad, unless I mark them in the first loop. so I mark them under !NDEBUG. -- ___ Python tracker <http://bugs.python.org/issue11462> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11462] Peephole creates duplicate and unused constants
Eugene Toder added the comment: > _PyCode_AddObj() should be added to Include/code.h Should it be declared as PyAPI_FUNC too? This will make it unnecessarily exported (when patch in Issue11410 is merged), so I wanted to avoid that. btw, that you for reviewing the patch. -- ___ Python tracker <http://bugs.python.org/issue11462> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11462] Peephole creates duplicate and unused constants
Eugene Toder added the comment: > Right now, the pattern is tokenize -> parse -> AST -> genbytecode -> > peephole optimization (which disassembles the bytecode, analyzed it > and rewrites it) -> final bytecode. The correct pattern is tokenize > -> parse -> AST -> optimize -> genbytecode -> peephole optimization > with minimal disassembly, analysis, and opcode rewrites -> final bytecode. Actually, optimizing on AST is not ideal too. Ideally you should convert it into a specialized IR, preferably in SSA form and with explicit control flow. Re size saving: I've ran make test with and without my patch and measured total size of all generated pyc files: without patch: 16_619_340 with patch: 16_467_867 So it's about 150KB or 1% of the size, not just a few bytes. -- ___ Python tracker <http://bugs.python.org/issue11462> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11471] If without else generates redundant jump
New submission from Eugene Toder : If statement without else part generates unnecessary JUMP_FORWARD insn with jumps right to the next insn: >>> def foo(x): if x: x = 1 >>> dis(foo) 2 0 LOAD_FAST0 (x) 3 POP_JUMP_IF_FALSE 15 6 LOAD_CONST 1 (1) 9 STORE_FAST 0 (x) 12 JUMP_FORWARD 0 (to 15) >> 15 LOAD_CONST 0 (None) 18 RETURN_VALUE This patch suppresses generation of this jump. Testing revealed another issue: when AST is produced from string empty 'orelse' sequences are represented with NULLs. However when AST is converted from Python AST objects empty 'orelse' is a pointer to 0-length sequence. I've changed this to produce NULL pointers, like in the string case. This uses less memory and doesn't introduce different code path in compiler. Without this change test_compile failed with my first change. make test passes. -- components: Interpreter Core files: if_no_else.patch keywords: patch messages: 130623 nosy: eltoder priority: normal severity: normal status: open title: If without else generates redundant jump type: performance versions: Python 3.3 Added file: http://bugs.python.org/file21091/if_no_else.patch ___ Python tracker <http://bugs.python.org/issue11471> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11471] If without else generates redundant jump
Eugene Toder added the comment: Test case (needed some refactoring to avoid duplication). -- Added file: http://bugs.python.org/file21092/if_no_else_test.patch ___ Python tracker <http://bugs.python.org/issue11471> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11462] Peephole creates duplicate and unused constants
Eugene Toder added the comment: Thomas, can you clarify, does loading interns all constants in co_consts, or do you mean that they are mostly small numbers and the like? Without interning I think that in-memory size difference is even bigger than on-disk, as you get one entry in tuple and the object itself. I'm sure I can cook up a test that will show some perf difference, because of cache misses or paging. You can say that this is not real world code, and you will likely be right. But in real world (before you add inlining and constant propagation) constant folding doesn't make a lot of difference too, yet people asked for it and peepholer does it. Btw, Antoine just improved it quite a bit (Issue11244), so size difference with my patch should increase. My rationale for the patch is that 1) it's real very simple 2) it removes feeling of half-done job when you look at the bytecode. -- ___ Python tracker <http://bugs.python.org/issue11462> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11244] Negative tuple elements produce inefficient code.
Eugene Toder added the comment: Yes, my patch doesn't fix the regression, only a special case of -0. -- ___ Python tracker <http://bugs.python.org/issue11244> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11462] Peephole creates duplicate and unused constants
Eugene Toder added the comment: Alexander, my patch does 2 optimizations: doesn't insert a new constant if one already exist and removes unused constants after peephole is done. You patch seems to do only the latter. It's very similar, from a quick look at your patch: - My patch doesn't introduce any additional passes over the code (you added 2 passes). - It preserves doc string. - It's less code, because I reuse more of existing code. Feel free to look at the patch and tell me if you don't agree. -- ___ Python tracker <http://bugs.python.org/issue11462> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11462] Peephole creates duplicate and unused constants
Eugene Toder added the comment: Raymond, you can be assured that I'm not developing this patch, unless I'm told it has chances to be accepted. I don't like to waste my time. On a related note, your review of my other patches is appreciated. -- ___ Python tracker <http://bugs.python.org/issue11462> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11510] Peephole breaks set unpacking
New submission from Eugene Toder : Since the time 'x in set' optimization was added to peephole it has the following bug with set unpacking: >>> def foo(x,y): a,b = {x,y} return a,b >>> dis(foo) 2 0 LOAD_FAST0 (x) 3 LOAD_FAST1 (y) 6 ROT_TWO 7 STORE_FAST 2 (a) 10 STORE_FAST 3 (b) 3 13 LOAD_FAST2 (a) 16 LOAD_FAST3 (b) 19 BUILD_TUPLE 2 22 RETURN_VALUE That is, unpacking of literal set of sizes 2 and 3 is changed to ROT instructions. This, however, changes the semantics, as construction of set would eliminate duplicates. The difference can be demonstrated like this: Python 3.1 >>> foo(1,1) Traceback (most recent call last): File "", line 1, in File "", line 2, in foo ValueError: need more than 1 value to unpack Python 3.2 >>> foo(1,1) (1, 1) -- components: Interpreter Core messages: 130917 nosy: eltoder priority: normal severity: normal status: open title: Peephole breaks set unpacking type: compile error versions: Python 3.2 ___ Python tracker <http://bugs.python.org/issue11510> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Rewrite peephole to work on AST
New submission from Eugene Toder : As pointed out by Raymond, constant folding should be done on AST rather than on generated bytecode. Here's a patch to do that. It's rather long, so overview first. The patch replaces existing peephole pass with a folding pass on AST and a few changes in compiler. Feature-wise it should be on par with old peepholer, applying optimizations more consistently and thoroughly, but maybe missing some others. It passes all peepholer tests (though they seem not very extensive) and 'make test', but more testing is, of course, needed. I've split it in 5 pieces for easier reviewing, but these are not 5 separate patches, they should all be applied together. I can upload it somewhere for review or split it in other ways, let me know. Also, patches are versus 1e00b161f5f5, I will redo against head. TOC: 1. Changes to AST 2. Folding pass 3. Changes to compiler 4. Generated files (since they're checked in) 5. Tests In detail: 1. I needed to make some changes to AST to enable constant folding. These are a) Merge Num, Str, Bytes and Ellipsis constructors into a single Lit (literal) that can hold anything. For one thing, this removes a good deal of copy-paste in the code, since these are always treated the same. (There were a couple of places where Bytes ctor was missing for no apparent reason, I think it was forgotten.) Otherwise, I would have to introduce at least 4 more node types: None, Bool, TupleConst, SetConst. This seemed excessive. b) Docstring is now an attribute of Module, FunctionDef and ClassDef, rather than a first statement. Docstring is a special syntactic construction, it's not an executable code, so it makes sense to separate it. Otherwise, optimizer would have to take extra care not to introduce, change or remove docstring. For example: def foo(): "doc" + "string" Without optimizations foo doesn't have a docstring. After folding, however, the first statement in foo is a string literal. This means that docstring depends on the level of optimizations. Making it an attribute avoids the problem. c) 'None', 'True' and 'False' are parsed as literals instead of names, even without optimizations. Since they are not redefineable, I think it makes most sense to treat them as literals. This isn't strictly needed for folding, and I haven't removed all the artefacts, in case this turns out controversial. 2. Constant folding (and a couple of other tweaks) is performed by a visitor. The visitor is auto-generated from ASDL and a C template. C template (Python/ast_opt.ct) provides code for optimizations and rules on how to call it. Parser/asdl_ct.py takes this and ASDL and generates a visitor, that visits only nodes which have associated rules (but visits them via all paths). The code for optimizations itself is pretty straight-forward. The generator can probably be used for symtable.c too, removing ~200 tedious lines of code. 3. Changes to compiler are in 3 categories a) Updates for AST changes. b) Changes to generate better code and not need any optimizations. This includes tuple unpacking optimization and if/while conditions. c) Simple peephole pass on compiler internal structures. This is a better form for doing this, than a bytecode. The pass only deals with jumps to jumps/returns and trivial dead code. I've also made 'raise' recognized as a terminator, so that 'return None' is not inserted after it. 4, 5. No big surprises here. -- components: Interpreter Core messages: 130955 nosy: eltoder priority: normal severity: normal status: open title: Rewrite peephole to work on AST versions: Python 3.3 ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Rewrite peephole to work on AST
Changes by Eugene Toder : -- keywords: +patch Added file: http://bugs.python.org/file21198/0_ast.patch ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Rewrite peephole to work on AST
Changes by Eugene Toder : Added file: http://bugs.python.org/file21199/0_fold.patch ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Rewrite peephole to work on AST
Changes by Eugene Toder : Added file: http://bugs.python.org/file21200/0_compile.patch ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Rewrite peephole to work on AST
Changes by Eugene Toder : Added file: http://bugs.python.org/file21201/0_generated.patch ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Rewrite peephole to work on AST
Changes by Eugene Toder : Added file: http://bugs.python.org/file21202/0_tests.patch ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Rewrite peephole to work on AST
Changes by Eugene Toder : -- nosy: +pitrou, rhettinger ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11244] Negative tuple elements produce inefficient code.
Eugene Toder added the comment: Is anyone reviewing my patch? It's just 1 line long. Should it be moved to another issue? Though, technically, it's needed to address the regression in question: Python 3.1 folds -0, the current code still does not. -- ___ Python tracker <http://bugs.python.org/issue11244> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment: Because I don't know how to make them. Any pointers? -- ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment: Thanks. Review link: http://codereview.appspot.com/4281051 -- ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment: I see. Should I attach diffs vs. some revision from hg.python.org? -- ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment: Any comments on the code so far or suggestions on how we should move forward? -- ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment: I've updated patches on Rietveld with some small changes. This includes better code generation for boolops used outside of conditions and cleaned up optimize_jumps(). This is probably the last change before I get some feedback. Also, I forgot to mention yesterday, patches on Rietveld are vs. ab45c4d0b6ef -- ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment: Just for fun I've run pystones. W/o my changes it averages to about 70k, with my changes to about 72.5k. -- ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment: AFAICT my patch has everything that #1346238 has, except BoolOps, which can be easily added (there's a TODO). I don't want to add any new code, though, until the current patch will get reviewed -- adding code will only make reviewing harder. #10399 looks interesting, I will take a look. -- ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment: Is anyone looking or planing to look at the patch? -- ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment: Thanks. > string concatenation will now work, and errors like "'hello' - 'world'" > should give a more informative TypeError Yes, 'x'*5 works too. > Bikeshed: We use Attribute rather than Attr for that node type, > perhaps the full "Literal" name would be better Lit seemed more in line with Num, Str, BinOp etc. No reason it can't be changed, of course. > Lib/test/disutil.py should really be made a feature of the dis module > itself Agreed, but I didn't want to widen the scope of the patch. If this is something that can be reviewed quickly, I can make a change to dis. I'd add a keyword-only arg to dis and disassembly -- an output stream defaulting to stdout. dis_to_str then passes StringIO and returns the string. Sounds OK? > Since the disassembly is interpreter specific, the new disassembly > tests really shouldn't go directly in test_compile.py. A separate > "test_ast_optimiser" file would be easier for alternate > implementations to skip over. New tests in test_compiler are not for the AST pass, but for changes to compile.c. I can split them out, how about test_compiler_opts? > I'd like to see a written explanation for the first few changes in > test_peepholer.py Sure. 1) not x == 2 can be theoretically optimized to x != 2, while this test is for another optimization. not x is more robust. 2) Expression statement which is just a literal doesn't produce any code at all. This is now true for None/True/False as well. To preserve constants in the output I've put them in tuples. > When you get around to rebasing the patch on 3.3 trunk, don't forget > to drop any unneeded "from __future__" imports. If you're referring to asdl_ct.py, that's actually an interesting question. asdl_ct.py is run by system installed python2, for obvious reasons. What is the policy here -- what is the minimum version of system python that should be sufficient to build python3? I tested my code on 2.6 and 3.1, and with __future__ it should work on 2.5 as well. Is this OK or should I drop 'with' so it runs on 2.4? > The generated code for the Lit node type looks wrong: it sets v to > Py_None, then immediately checks to see if v is NULL again. Right, comment in asdl_c.py says: # XXX: special hack for Lit. Lit value can be None and it # should be stored as Py_None, not as NULL. If there's a general agreement on Lit I can certainly clean this up. > Don't use "string" as a C type - use "char *" (and "char **" instead > of "string *"). string is a typedef for PyObject that ASDL uses. I don't think I have a choice to not use it. Can you point to a specific place where char* could be used? > There should be a new compiler flag to skip the AST optimisation step. There's already an 'optimizations level' flag. Maybe we should make it more meaningful rather than multiplying the number of flags? > A bunch of the compiler internal changes seem to make the basic flow > of the generated assembly not match the incoming source code. Can you give an example of what you mean? The changes are basically 1) standard way of handling conditions in simple compilers 2) peephole. > It doesn't seem like a good idea to conflate these with the AST > optimisation patch. If that means leaving the peepholer in to handle > them for now, that's OK - it's fine to just descope the peepholer > without removing it entirely. The reason why I think it makes sense to have this in a single change is testing. This allows to reuse all existing peephole tests. If I leave old peephole enabled there's no way to tell if my pass did something from disassembly. I can port tests to AST, but that seemed like more work than match old peepholer optimizations. Is there any opposition to doing simple optimizations on compiler structures? They seem a good fit for the job. In fact, if not for stack representation, I'd say that they are better IR for optimizer than AST. Also, can I get your opinion on making None/True/False into literals early on and getting rid of forbidden_name? Antoine, Georg -- I think Nick's question is not about AST changing after optimizations (this can indeed be a separate flag), but the structure of AST changing. E.g. collapsing of Num/Str/Bytes into Lit. Btw, if this is acceptable I'd make a couple more changes to make scope structure obvious from AST. This will allow auto-generating much of the symtable pass. -- ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment: > and with __future__ it should work on 2.5 as well. Actually, seems that at least str.format is not in 2.5 as well. Still the question is should I make it run on 2.5 or 2.4 or is 2.6 OK (then __future__ can be removed). -- ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment: > I don't think it can: That already doesn't work in dict and set (eq not consistent with hash), I don't think it's a big problem if that stops working in some other cases. Anyway, I said "theoretically" -- maybe after some conservative type inference. -- ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment: Also, to avoid any confusion -- currently my patch only runs AST optimizations before code generation, so compile() with ast.PyCF_ONLY_AST returns non-optimized AST. -- ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment: If we have to preserve backward compatibility of Python AST API, we can do this relatively easily (at the expense of some code complexity): * Add 'version' argument to compile() and ast.parse() with default value of 1 (old AST). Value 2 will correspond to the new AST. * Do not remove Num/Str/Bytes/Ellipsis Python classes. Make PyAST_obj2mod and PyAST_mod2obj do appropriate conversions when version is 1. * Possibly emit a PendingDeprecationWarning when version 1 is used with the goal of removing it in 3.5 Alternative implementation is to leave Num/Str/etc classes in C as well, and write visitors (similar to folding one) to convert AST between old and new forms. Does this sounds reasonable? Should this be posted to python-dev? Should I write a PEP (I'd need some help with this)? Are there any other big issues preventing this to be merged? -- ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11816] Add functions to return disassembly as string
New submission from Eugene Toder : As discussed in Issue11549 a couple of tests need to inspect disassembly of some code. Currently they have to override sys.stdout, run dis and restore stdout back. It would be much nicer if dis module provided functions that return disassembly as a string. Provided is a patch that adds file argument to most dis functions, defaulting to sys.stdout. On top of that there are 2 new functions: dis_to_str and disassembly_to_str that return disassembly as a string instead of writing it to a file. -- components: Library (Lib) files: dis.diff keywords: patch messages: 133437 nosy: eltoder priority: normal severity: normal status: open title: Add functions to return disassembly as string type: feature request versions: Python 3.3 Added file: http://bugs.python.org/file21598/dis.diff ___ Python tracker <http://bugs.python.org/issue11816> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11816] Add functions to return disassembly as string
Changes by Eugene Toder : Removed file: http://bugs.python.org/file21598/dis.diff ___ Python tracker <http://bugs.python.org/issue11816> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11816] Add functions to return disassembly as string
Changes by Eugene Toder : Added file: http://bugs.python.org/file21599/dis.patch ___ Python tracker <http://bugs.python.org/issue11816> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11816] Add functions to return disassembly as string
Changes by Eugene Toder : -- nosy: +ncoghlan ___ Python tracker <http://bugs.python.org/issue11816> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11816] Add functions to return disassembly as string
Eugene Toder added the comment: Agreed, but that would require rewriting of all tests in test_peepholer. -- ___ Python tracker <http://bugs.python.org/issue11816> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5996] abstract class instantiable when subclassing dict
Eugene Toder added the comment: This patch fixes the problem by moving the check from object_new to PyType_GenericAlloc. The check is very cheap, so this should not be an issue. -- keywords: +patch nosy: +eltoder Added file: http://bugs.python.org/file21600/abc.patch ___ Python tracker <http://bugs.python.org/issue5996> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11816] Refactor the dis module to provide better building blocks for bytecode analysis
Eugene Toder added the comment: So in the near term, dis-based tests should continue to copy/paste sys.stdout redirection code? -- ___ Python tracker <http://bugs.python.org/issue11816> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11983] Inconsistent hash and comparison for code objects
New submission from Eugene Toder : A comment in the definition of PyCodeObject in Include/code.h says: /* The rest doesn't count for hash or comparisons */ which, I think, makes a lot of sense. The implementation doesn't follow this comment, though. code_hash actually includes co_name and code_richcompare includes co_name and co_firstlineno. This makes hash and comparison inconsistent with each other and with the comment. -- components: Interpreter Core messages: 135015 nosy: eltoder priority: normal severity: normal status: open title: Inconsistent hash and comparison for code objects type: behavior versions: Python 3.3 ___ Python tracker <http://bugs.python.org/issue11983> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11983] Inconsistent hash and comparison for code objects
Eugene Toder added the comment: I would propose changing implementation to match the comment. At a minimum, remove co_firstlineno comparison. As the last resort, at least change the comment. -- ___ Python tracker <http://bugs.python.org/issue11983> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11983] Inconsistent hash and comparison for code objects
Eugene Toder added the comment: It appears that * co_name was added to hash and cmp in this check-in by Guido: http://hg.python.org/cpython-fullhistory/diff/525b2358721e/Python/compile.c I think the reason was to preserve function name when defining multiple functions with the same code in one function or module. (By default function gets it's name from code, but only one code object will be preserved, since all constants in a function are stored in a dict during compilation). * co_firstlineno was added to cmp (but not hash) in this check-in by Brett: http://hg.python.org/cpython-fullhistory/rev/8127a55a57cb In an attempt to fix this bug: http://www.mail-archive.com/python-bugs-list@python.org/msg02440.html It doesn't actually fix the bug and makes hash inconsistent with cmp. I'm not convinced that the bug is valid -- why would you care if identical lambdas share or not share the code? Both changes seem come from a tension between code object's original intention to compare "functionally equivalent" codes equal and side-effects of constants de-duplication in a function (loss of function name, broken breakpoints and line numbers in a debugger). I can think of 2 ways to address this. Change hash/cmp back to ignore co_name and co_firstlineno and then: 1) Never dedup codes, or only dedup codes with the same co_firstlineno (can compare co_name too, but I can't think of a way to create 2 named funcs on the same line). This is pretty much what the current code does. 2) Separate "debug information" (co_filename, co_name, co_firstlineno, co_lnotab) from code object into a separate object. Construct a function from both objects. Allow de-duplication of both. This will always preserve all information in a function, but allows to share code object between identical functions. This is a much more intrusive change, though, e.g. frame will need a reference to debug information. -- nosy: +Jeremy.Hylton, brett.cannon, ncoghlan ___ Python tracker <http://bugs.python.org/issue11983> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11983] Inconsistent hash and comparison for code objects
Eugene Toder added the comment: Btw, disabling dedup for codes won't be the first exception -- we already avoid coalescing -0.0 with 0.0 for float and complex, even though they compare equal. -- ___ Python tracker <http://bugs.python.org/issue11983> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42125] linecache cannot get source for the __main__ module with a custom loader
Change by Eugene Toder : -- nosy: +brett.cannon ___ Python tracker <https://bugs.python.org/issue42125> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42125] linecache cannot get source for the __main__ module with a custom loader
Change by Eugene Toder : -- nosy: +ncoghlan, vstinner ___ Python tracker <https://bugs.python.org/issue42125> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42903] optimize lru_cache for functions with no arguments
New submission from Eugene Toder : It's convenient to use @lru_cache on functions with no arguments to delay doing some work until the first time it is needed. Since @lru_cache is implemented in C, it is already faster than manually caching in a closure variable. However, it can be made even faster and more memory efficient by not using the dict at all and caching just the one result that the function returns. Here are my timing results. Before my changes: $ ./python -m timeit -s "import functools; f = functools.lru_cache()(lambda: 1)" "f()" 500 loops, best of 5: 42.2 nsec per loop $ ./python -m timeit -s "import functools; f = functools.lru_cache(None)(lambda: 1)" "f()" 500 loops, best of 5: 38.9 nsec per loop After my changes: $ ./python -m timeit -s "import functools; f = functools.lru_cache()(lambda: 1)" "f()" 1000 loops, best of 5: 22.6 nsec per loop So we get improvement of about 80% compared to the default maxsize and about 70% compared to maxsize=None. -- components: Library (Lib) messages: 384883 nosy: eltoder, serhiy.storchaka, vstinner priority: normal severity: normal status: open title: optimize lru_cache for functions with no arguments type: performance versions: Python 3.10 ___ Python tracker <https://bugs.python.org/issue42903> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42903] optimize lru_cache for functions with no arguments
Eugene Toder added the comment: As you can see in my original post, the difference between @cache (aka @lru_cache(None)) and just @lru_cache() is negligible in this case. The optimization in this PR makes a much bigger difference. At the expense of some lines of code, that's true. Also, function calls in Python are quite slow, so being faster than a function call is not a high bar. -- ___ Python tracker <https://bugs.python.org/issue42903> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42903] optimize lru_cache for functions with no arguments
Eugene Toder added the comment: Ammar, thank you for the link. -- ___ Python tracker <https://bugs.python.org/issue42903> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42903] optimize lru_cache for functions with no arguments
Eugene Toder added the comment: @cache does not address the problem or any of the concerns brought up in the thread. Thread-safe @once is a nice idea, but more work of course. -- ___ Python tracker <https://bugs.python.org/issue42903> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42125] linecache cannot get source for the __main__ module with a custom loader
New submission from Eugene Toder : If a module has a loader, linecache calls its get_source() passing __name__ as the argument. This works most of the time, except that the __main__ module has it set to "__main__", which is commonly not the real name of the module. Luckily, we now have __spec__ which has the real name, so we can just use it. Attached zip file reproduces the problem: $ python t.zip Traceback (most recent call last): ... File "t.zip/t.py", line 11, in File "t.zip/t.py", line 8, in f File "t.zip/t.py", line 8, in f File "t.zip/t.py", line 8, in f [Previous line repeated 2 more times] File "t.zip/t.py", line 7, in f ValueError Note that entries from t.py don't have source code lines. -- components: Library (Lib) files: t.zip messages: 379408 nosy: eltoder priority: normal severity: normal status: open title: linecache cannot get source for the __main__ module with a custom loader type: behavior versions: Python 3.10 Added file: https://bugs.python.org/file49536/t.zip ___ Python tracker <https://bugs.python.org/issue42125> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Build-out an AST optimizer, moving some functionality out of the peephole optimizer
Eugene Toder added the comment: > Method calls on literals are always fair game, though (e.g. you could > optimise "a b c".split()) What about optimizations that do not change behavior, except for different error messages? E.g. we can change y = [1,2][x] to y = (1,2)[x] where the tuple is constant and is stored in co_consts. This will, however, produce a different text in the exception when x is not 0 or 1. The type of exception is going to be the same. -- ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Build-out an AST optimizer, moving some functionality out of the peephole optimizer
Eugene Toder added the comment: If I'm not missing something, changing x in [1,2] to x in (1,2) and x in {1,2} to x in frozenset([1,2]) does not change any error messages. Agreed that without dynamic compilation we can pretty much only track literals (including functions and lambdas) assigned to local variables. -- ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue16602] weakref can return an object with 0 refcount
New submission from Eugene Toder: An interaction between weakrefs and trashcan can cause weakref to return the object it's pointing to after object's refcount is already 0. Given that the object is usually increfed and decrefed, this leads to double dealloc and crashing or hanging. Tested 2.6.6, 2.7.2 and 3.3.0 on win7. In more details. The documentation for Py_TRASHCAN_SAFE_BEGIN tells to put it right after PyObject_GC_UnTrack. This means that PyObject_ClearWeakRefs goes after it, and, for example, in subtype_dealloc of Objects/typeobject.c this is indeed the case. This means that if we get into a long chain of deallocations and the trashcan kicks in, we get an object with 0 refcount and without cleared weakrefs lying in trash_delete_later until we go PyTrash_UNWIND_LEVEL levels up. During that time we can execute arbitrary python code, so all we need is some code with an access to the weakref to dereference it. The current recommendation of clearing weakrefs before clearing attributes makes this less likely to happen, but it's still easy enough: import weakref class C: def __init__(self, parent): if not parent: return wself = weakref.ref(self) def cb(wparent): o = wself() self.wparent = weakref.ref(parent, cb) d = weakref.WeakKeyDictionary() root = c = C(None) for n in range(100): d[c] = c = C(c) print('deleting') del root print('done') In this case weakref callback in WeakKeyDictionary deletes the reference to the next object, causing the next level of destruction, until trashcan kicks in. Trashcan delays clearing of weakrefs, allowing the second weakref's (wparent) callback to see the dead object via wself that it captured. Attached is a similar example with less weakrefs using __del__. -- components: Interpreter Core files: wr2.py messages: 176874 nosy: eltoder, pitrou priority: normal severity: normal status: open title: weakref can return an object with 0 refcount type: crash versions: Python 2.6, Python 2.7, Python 3.1, Python 3.2, Python 3.3 Added file: http://bugs.python.org/file28204/wr2.py ___ Python tracker <http://bugs.python.org/issue16602> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue16602] weakref can return an object with 0 refcount
Eugene Toder added the comment: Agreed. That's what I've put in my code as a work around. -- ___ Python tracker <http://bugs.python.org/issue16602> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue16602] weakref can return an object with 0 refcount
Eugene Toder added the comment: Thank you, Antoine. This change has one effect that's worth highlighting in NEWS at least -- the PyWeakref_GET_OBJECT() macro now evaluates its argument twice. This can break existing code where the argument has side-effects, e.g. o = PyWeakref_GET_OBJECT(p++). I found one such case in our code base, but I don't know how common this is. So this is something to watch out for when upgrading. I don't think there's a way to write PyWeakref_GET_OBJECT() in standard C90 without double evaluation. -- ___ Python tracker <http://bugs.python.org/issue16602> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue16602] weakref can return an object with 0 refcount
Eugene Toder added the comment: > People should simply avoid doing this kind of thing, as it's knowingly > fragile, and trivial to avoid anyway. Is this documented in the C API guide, or somewhere else? In any case, notifying people so they can quickly check their code seems much nicer than having them figure this out the hard way. -- ___ Python tracker <http://bugs.python.org/issue16602> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue1062277] Pickle breakage with reduction of recursive structures
Eugene Toder added the comment: To recap, the issue is that pickle doesn't handle recursion via reduce arguments (i.e. arguments to the constructor function as returned in 2nd element of the tuple from __reduce__). This leads to 2 kind of effects: class C: def __init__(self, x=None): self.x = x if x is not None else self def __reduce__(self): return C, (self.x,) A. Recursion error: >>> pickle.dumps(C()) Traceback (most recent call last): File "", line 1, in pickle.dumps(C()) RuntimeError: maximum recursion depth exceeded while calling a Python object This cannot be helped with the current reduce protocol. The error may be improved, but that's about it. B. Duplication of object when unpickling: >>> c = C([]) >>> c.x.append(c) >>> c.x[0] is c True >>> c2 = pickle.loads(pickle.dumps(c)) >>> c2.x[0] is c2 False This happens because list (or another recursion-friendly type) inside the problematic object handles recursion, but we still get the outer object, identical to the inner one. This can be solved the same way as for tuple: >>> t=([],1,2) >>> t[0].append(t) >>> t2 = pickle.loads(pickle.dumps(t)) >>> t2[0][0] is t2 True >>> pickletools.dis(pickle.dumps(t)) 0: \x80 PROTO 3 2: ]EMPTY_LIST 3: qBINPUT 0 5: hBINGET 0 7: KBININT11 9: KBININT12 11: \x87 TUPLE3 12: qBINPUT 1 14: aAPPEND 15: KBININT11 17: KBININT12 19: 0POP 20: 0POP 21: 0POP 22: hBINGET 1 24: .STOP After pickling its elements tuple checks if it got into memo. If it did, this means it was pickled by one of the elements, so it POPs all elements from the stack and fetches itself via GET. This is somewhat inefficient, but probably the best it can do. I suggest we do 3 things: 1. Improve the documentation for __reduce__ function. It should mention that all state that a) can potentially point back to the object and b) not strictly necessary in the constructor function should be passed via the 3rd element of __reduce__ tuple (aka state) instead of the 2nd element, and applied by __setstate__. This handles recursion in robust and optimal way. 2. Check that all built-in/standard types follow this advice. I see that Stefan Mihaila already fixed sets. 3. To fix case B above add the memo check from save_tuple to save_reduce. While at it, we can consider checking after pickling every element instead of after pickling all elements, so we reduce the number of POPs and the wastage they cause. -- nosy: +eltoder, pitrou ___ Python tracker <http://bugs.python.org/issue1062277> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue1062277] Pickle breakage with reduction of recursive structures
Changes by Eugene Toder : -- versions: +Python 3.3, Python 3.4 -Python 3.1 ___ Python tracker <http://bugs.python.org/issue1062277> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11983] Inconsistent hash and comparison for code objects
Eugene Toder added the comment: My comment will make more sense if you follow the links that I provided. Brett's check-in (http://hg.python.org/cpython-fullhistory/rev/8127a55a57cb) says that it fixes bug #1190011 (http://www.mail-archive.com/python-bugs-list@python.org/msg02440.html). The bug claims that: >> def f(): return ((lambda x=1: x), (lambda x=2: x)) >> f1, f2 = f() >> id(f1.func_code) != id(f2.func_code) The above does not hold true. It should according to test_compile (reported in HEAD as bug #1048870). Certainly comparing co_firstlineno can't fix this, as both lambdas are defined at the same line. There are some more comments by Nick Coghlan on that bug, but I can't find anything clearly stating what the problem is believed to be and what the resolution is. I understand the reason for Brett's fix and why fixing 1190011 literally is not required. The issue was strange debugger/profiler/etc behavior with identical lambdas defined on _different_ lines. Since python tools typically have line resolution (e.g. debugger doesn't know column position of the current statement), merging lambdas defined on the same line is mostly invisible, and does not cause problems. So to clarify my opening comment -- this is mostly a documentation issue. The only documentation for the behavior of code cmp is the comment in Include/code.h, which says: /* The rest doesn't count for hash or comparisons */ Both hash and compare break this in different ways. At a minimum the comment should be fixed. If the current behavior is considered right, this can be it. My other point is that we can have code objects with different co_firstlineno compare equal and still not have them deduplicated by the compiler (thus avoid problems with debugger). This can be done similar to how float -0.0 is equal to 0.0, but still not deduplicated. Separating these concerns can allow both most logical compare behavior and good support for debugging. Re hash legally returning same values for non-equal objects -- sure, only a==b implies hash(a)==hash(b), and not the other way round. But this doesn't mean that we should just return 42 from all hash functions. Not including a field in hash that you include compare needs a very good reason. If there's such a reason in this case, please let me know. -- status: pending -> open ___ Python tracker <http://bugs.python.org/issue11983> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11983] Inconsistent hash and comparison for code objects
Eugene Toder added the comment: If you add co_firstlineno into code_hash you can say something like /* The rest isn't used in hash and comparisons, except co_name and co_firstlineno, which are preserved for tracebacks and debuggers. */ (Otherwise you'd need to explain why co_firstlineno is not in hash.) BTW, since co_firstlineno is used including co_name is redundant -- there's no way to have 2 named code objects on the same line, AFAIK. -- ___ Python tracker <http://bugs.python.org/issue11983> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Build-out an AST optimizer, moving some functionality out of the peephole optimizer
Eugene Toder added the comment: > We have already Constant and NameConstant. So it seems there are no need for > None, Bool, TupleConst, SetConst nodes. Yes, Constant is Victor's version of Lit. > I think converting Num, Str, Bytes, Ellipsis into Constant in folding stage > is easier than fixing all tests. Fixing tests was fairly easy the last time. I think the question is what changes to the public API of AST are acceptable. > Take docstring before constant folding isn't enough? > (I'm sorry if I'm wrong. I haven't tried it yet.) It may be doable, but seems very messy. Instead of a clean pipeline text -> AST -> Optimized AST -> bytecode, you have to collect all docstrings, and pass them around in a side structure. With the current code there can be a simple fix. If original string literals are Str, but constant-folded string constants are Constant, only treat Strs as docstrings. Then the optimizer needs a change to always preserve Str as a first statement in a function/module, and that's it. I still think that having a dedicated docstring attribute in AST is cleaner, though. > They are all NameConstant already. Keep in mind this patch is 6 years old :) -- ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Build-out an AST optimizer, moving some functionality out of the peephole optimizer
Eugene Toder added the comment: Yes, doing optimizations on AST in CPython is unlikely to give any sizable speed improvements in real world programs. Python as a language is not suited for static optimization, and even if you manage to inline a function, there's still CPython's interpreted overhead and boxed types that dwarf the effect of the optimization. The goal of this patch was never to significantly improve the speed. It was to replace the existing bytecode peephole pass with cleaner and simpler code, which also happens to produce slightly better results. -- ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue24912] The type of cached objects is mutable
Eugene Toder added the comment: Nathaniel, what if we allow creating module objects from an existing dict instead of always creating a new one. Does this solve your namespace diversion problem? FWIW, when I discovered this change I was quite surprised this came through without a PEP. I agree that the old rules were somewhat arbitrary, but this is a significant change with non-trivial compatibility concerns. -- ___ Python tracker <http://bugs.python.org/issue24912> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue24912] The type of cached objects is mutable
Eugene Toder added the comment: Guido: IIUC the general intention is to support @property and __getattr__ style hooks on the module level. Assigning to sys.modules[name] from the module itself is a bit too late -- there's already a global dict for this module, and no way to create a module from an existing dict, so you end up with two global namespaces which is annoying (but admittedly people live with that). One way around that is to set up import hooks, but this is also annoying, and leaks module implementation outside of its code. -- ___ Python tracker <http://bugs.python.org/issue24912> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue24991] Define instance mutability explicitly on type objects
Changes by Eugene Toder : -- nosy: +eltoder ___ Python tracker <http://bugs.python.org/issue24991> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue1062277] Pickle breakage with reduction of recursive structures
Eugene Toder added the comment: Davis, the possible cases (i.e. recursion via appropriate mutable objects) are already supported, and the impossible cases are, well, impossible. The only open issue is whether to implement better error handling for infinite recursion problems, instead of relying on built-in maximum recursion depth. -- ___ Python tracker <http://bugs.python.org/issue1062277> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue23640] int.from_bytes() is broken for subclasses
Eugene Toder added the comment: There's a similar issue with replace() methods on date/time/datetime classes. They create instances of derived types without calling derived __new__/__init__, thus potentially leaving those uninitialized. >>> from datetime import date >>> class D(date): ... def __init__(self, y, m, d): ... self.y = y >>> D(2016,1,1).y 2016 >>> D(2016,1,1).replace(2015).y Traceback (most recent call last): File "", line 1, in AttributeError: 'D' object has no attribute 'y' -- nosy: +eltoder ___ Python tracker <http://bugs.python.org/issue23640> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue23640] int.from_bytes() is broken for subclasses
Eugene Toder added the comment: They sure do. Check the example in my comment, or if you prefer the source code, it's pretty clear as well: https://hg.python.org/cpython/file/fff0c783d3db/Modules/_datetimemodule.c#l2770 -- ___ Python tracker <http://bugs.python.org/issue23640> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Build-out an AST optimizer, moving some functionality out of the peephole optimizer
Eugene Toder added the comment: Fairly sure it's 5 years old. -- ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Build-out an AST optimizer, moving some functionality out of the peephole optimizer
Eugene Toder added the comment: Serhiy: Nice! Yes, _PyCode_ConstantKey solved the problem. But #16619 went in the opposite direction of this patch, and introduced a new type of literal node instead of unifying the existing ones. Kind of a shame, since *this* patch, I believe, both fixes that bug and removes the unreachable code in the example :) I also see that Victor has been doing some of the same work, e.g. #26146. -- ___ Python tracker <http://bugs.python.org/issue11549> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20371] datetime.datetime.replace bypasses a subclass's __new__
Eugene Toder added the comment: Properly supporting subclasses in replace() is hard, at least without some cooperation from subclasses. For a proper replace() x.replace(a=newval).b == x.b should hold for every b not dependent on a, including ones added by subclasses. That is, it should replicate subclass state. Arguably, ideal replace() would also allow changing attributes defined by subclasses -- otherwise subclasses need to override it anyway, and all this effort was for nothing. The best I can think of is to assume that subclasses are immutable and all "primary" properties are settable via constructor arguments with the same names. Then replace() can be implemented like this: def replace(self, *args, **kwargs): sig = inspect.signature(self.__new__) bound = sig.bind_partial(type(self), *args, **kwargs) for arg in sig.parameters: if arg not in bound.arguments: bound.arguments[arg] = getattr(self, arg) return self.__new__(*bound.args, **bound.kwargs) This will not work for subclasses defined in C, but at least we can show a nice error about that. This will also not work if __new__ uses *args or **kwargs instead of listing every property as its own argument. (Another approach I can think of is patching properties on self, making a copy of self via __reduce__, and reverting values on self back. This doesn't rely on any signatures, but gets really dirty really quick.) So I don't know if we want to implement this, or if returning base class from replace() is a better choice. -- nosy: +eltoder, serhiy.storchaka ___ Python tracker <http://bugs.python.org/issue20371> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20371] datetime.datetime.replace bypasses a subclass's __new__
Eugene Toder added the comment: Yes, this is similar to my second approach. (You need to copy via __reduce__, because __copy__ my have been overriden to return self.) The general problem with it, is that if a subclass is designed to be immutable (probably a common case here), it may not handle changing of its attributes well. Attributes will be readonly, so you can't support replace(derived_attr=newval). And while you can change base class' attributes under the covers, doing so may break subclass invariants. E.g. it may have cached hashcode or another property derived from other attributes. There's also a smaller problem of figuring attribute names for positional arguments of replace() to support derived attributes. -- ___ Python tracker <http://bugs.python.org/issue20371> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20371] datetime.datetime.replace bypasses a subclass's __new__
Eugene Toder added the comment: This seems like it can lead to even more subtle bugs when replace() is not overriden. Currently, any extra state in the subclass is left uninitialized, which usually leads to obvious breakage and is relatively easy to trace back to replace(). (I've done it before.) If we call the constructor passing only base class values, the extra state may get initialized with default values. This is probably not what anyone wants, and is probably harder to debug, because there's no obvious breakage. -- ___ Python tracker <http://bugs.python.org/issue20371> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue20371] datetime.datetime.replace bypasses a subclass's __new__
Eugene Toder added the comment: namedtuple._replace() actually doesn't call subclass' __new__. It calls tuple.__new__ directly, so it has the same problem as datetime classes. Parameter and Signature are new in 3.3. I'm not sure if they're expected to be used as base classes. @r.david.murray: is that contract specified anywhere? The doc says "Return a datetime with the same attributes, except for those attributes given new values by whichever keyword arguments are specified." This doesn't explicitly mention subclasses, but also doesn't mention the possibility of discarding any attribute values. -- ___ Python tracker <http://bugs.python.org/issue20371> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue23726] Don't enable GC for classes that don't add new fields
New submission from Eugene Toder: As far as I can tell, if a new class does not add any new fields, and its base class doesn't use GC, there's no reason to enable GC for the new class. This is useful for creating lightweight wrappers around classes implemented in C. -- files: class_gc.diff keywords: patch messages: 238718 nosy: eltoder, pitrou priority: normal severity: normal status: open title: Don't enable GC for classes that don't add new fields type: performance versions: Python 3.6 Added file: http://bugs.python.org/file38610/class_gc.diff ___ Python tracker <http://bugs.python.org/issue23726> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue23726] Don't enable GC for classes that don't add new fields
Eugene Toder added the comment: Agreed, but this is not new. This works without my change: >>> class Tuple(tuple): ... __slots__ = () ... def __repr__(self): return 'Imma tuple!' ... >>> ().__class__ = Tuple >>> () Imma tuple! -- ___ Python tracker <http://bugs.python.org/issue23726> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue23726] Don't enable GC for classes that don't add new fields
Eugene Toder added the comment: Actually, this is rather new -- new in 3.5. The check was relaxed in #22986: https://hg.python.org/cpython/rev/c0d25de5919e Previously only heap types allowed re-assigning __class__. -- ___ Python tracker <http://bugs.python.org/issue23726> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue23726] Don't enable GC for classes that don't add new fields
Changes by Eugene Toder : Added file: http://bugs.python.org/file38624/class_gc2.diff ___ Python tracker <http://bugs.python.org/issue23726> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue23726] Don't enable GC for classes that don't add new fields
Eugene Toder added the comment: Thank you! Benjamin, Nathaniel, any opinion if we should restrict reassigning __class__ for types like tuple, int and str, where some/many instances are cached? -- nosy: +njs ___ Python tracker <http://bugs.python.org/issue23726> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue23726] Don't enable GC for classes that don't add new fields
Eugene Toder added the comment: Agreed. There's a small problem with that, as far as I know. Nothing on type declares that it is immutable. -- ___ Python tracker <http://bugs.python.org/issue23726> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com