[issue24978] Contributing to Python 2x and 3x Documentation. Translation to Russian.
New submission from Eugene: I apologize if this is not the right place to ask, but I would like to make a thorough translation of the official Library Reference and the beginner's guide for Python 2.x and 3.x into Russian. I am aware that this kind of translation will be placed at https://wiki.python.org/moin/RussianLanguage, as the current https://docs.python.org/2/ does not allow changing the language. I am also aware that the translation should not differ from the original in style, form, or overall attitude. I would also like to know whether this link can be promoted somewhere at python.org. I have no intention of promoting it anywhere except the place where most Russian programmers would be most likely to find it: the official Python website. Kind regards, Evgeniy. -- assignee: docs@python components: Documentation messages: 249478 nosy: docs@python, overr1de priority: normal severity: normal status: open title: Contributing to Python 2x and 3x Documentation. Translation to Russian. type: enhancement versions: Python 2.7 ___ Python tracker <http://bugs.python.org/issue24978> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11549] Build-out an AST optimizer, moving some functionality out of the peephole optimizer
Eugene Toder added the comment: Nick, if there's an interest in reviewing the patch I can update it. I doubt it needs a lot of changes, given that the visitor is auto-generated. Raymond, the patch contains a rewrite of low-level optimizations to work before byte code generation, which simplifies them a great deal.
[issue12290] __setstate__ is called for false values
New submission from Eugene Toder:

Pickle documentation [1] says:

"""
Note: If __getstate__() returns a false value, the __setstate__() method will not be called upon unpickling.
"""

However, this isn't quite true. It depends on the version of the pickle protocol. A small example:

>>> class Pockle(object):
        def __getstate__(self):
            return 0
        def __setstate__(self, state):
            sys.stdout.write('__setstate__ is called!\n')

>>> for p in range(4):
        sys.stdout.write('protocol %d: ' % p)
        pickle.loads(pickle.dumps(Pockle(), p))

protocol 0: <__main__.Pockle object at 0x02EAE3C8>
protocol 1: <__main__.Pockle object at 0x02EAE358>
protocol 2: __setstate__ is called!
<__main__.Pockle object at 0x02EAE3C8>
protocol 3: __setstate__ is called!
<__main__.Pockle object at 0x02EAE358>

So for protocols >= 2 __setstate__ is called. This is caused by object.__reduce_ex__ returning different tuples for different protocol versions:

>>> for p in range(4):
        sys.stdout.write('protocol %d: %s\n' % (p, Pockle().__reduce_ex__(p)))

protocol 0: (, (, , None))
protocol 1: (, (, , None))
protocol 2: (, (,), 0, None, None)
protocol 3: (, (,), 0, None, None)

The implementation of __reduce_ex__ for protocols 0-1 in copy_reg.py contains the documented check:
http://hg.python.org/cpython/file/f1509fc75435/Lib/copy_reg.py#l85

The implementation for protocol 2+ in typeobject.c is happy with any value:
http://hg.python.org/cpython/file/f1509fc75435/Objects/typeobject.c#l3205

Pickle itself only ignores None, not any false value:
http://hg.python.org/cpython/file/f1509fc75435/Lib/pickle.py#l418

I think this is a documentation issue at this point.

[1] http://docs.python.org/py3k/library/pickle.html#pickle.object.__setstate__

-- components: Library (Lib) messages: 137935 nosy: eltoder priority: normal severity: normal status: open title: __setstate__ is called for false values versions: Python 2.6, Python 2.7, Python 3.1, Python 3.2, Python 3.3
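The protocol dependence reported above can be looked at directly through what `__reduce_ex__` hands to the pickler. What follows is only a sketch for a current Python 3, not code from the report: the class `FalsyState` and the helpers `reduced_state` and `setstate_called` are names invented for illustration.

```python
import pickle

# Records every state value passed to __setstate__ (illustrative bookkeeping).
calls = []


class FalsyState:
    """Hypothetical class whose __getstate__ returns a false value."""

    def __getstate__(self):
        return 0

    def __setstate__(self, state):
        calls.append(state)


def reduced_state(obj, protocol):
    """Return the state item of obj.__reduce_ex__(protocol), or None if absent."""
    reduced = obj.__reduce_ex__(protocol)
    return reduced[2] if len(reduced) > 2 else None


def setstate_called(protocol):
    """Round-trip a FalsyState through pickle; report whether __setstate__ ran."""
    calls.clear()
    pickle.loads(pickle.dumps(FalsyState(), protocol))
    return bool(calls)
```

Under these assumptions, `reduced_state(FalsyState(), 2)` is `0` rather than `None`, so the pickler still emits a state for protocol 2 and later; calling `setstate_called(p)` for each protocol is one way to repeat the experiment from the report.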
[issue12290] __setstate__ is called for false values
Changes by Eugene Toder: -- nosy: +alexandre.vassalotti, pitrou
[issue12290] __setstate__ is called for false values
Eugene Toder added the comment: So how about this correction? -- keywords: +patch nosy: +belopolsky, georg.brandl Added file: http://bugs.python.org/file22375/setstate.diff
[issue5996] abstract class instantiable when subclassing dict
Eugene Toder added the comment: Does anyone have any thoughts on this?
[issue11549] Build-out an AST optimizer, moving some functionality out of the peephole optimizer
Eugene Toder added the comment:

I found a problem in constant de-duplication, already performed by the compiler, that needs to be fixed before this change can be merged.

The compiler tries to eliminate duplicate constants by putting them into a dict. However, "duplicate" in this case doesn't mean just "equal"; we need a stronger relationship, as there are many equal values that behave differently in some contexts, e.g. 0 and 0.0 and False, or 0.0 and -0.0. To this end, for each value we create a tuple of the value and its type, and have some logic for -0.0. This is handled in compiler_add_o in Python/compile.c.

This logic, however, only works for scalar values: if we get a container with 0 and the same container with False, we collapse them into one. This was not a problem before, because constant tuples were only created by peephole, which doesn't attempt de-duplication. If tuple folding is moved to the AST we start hitting this problem:

>>> dis(lambda: print((0,1),(False,True)))
  1           0 LOAD_GLOBAL              0 (print)
              3 LOAD_CONST               1 ((0, 1))
              6 LOAD_CONST               1 ((0, 1))
              9 CALL_FUNCTION            2
             12 RETURN_VALUE

The cleanest solution seems to be to introduce a new rich comparison code, Py_EQUIV (equivalent), and implement it at least in the types that we support in marshal. This will simplify compiler_add_o quite a bit and make it work for tuples and frozensets. I'm open to other suggestions.
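The container problem described above can be reproduced in pure Python. The sketch below is not the logic of compiler_add_o and not the proposed Py_EQUIV; `naive_key` and `deep_key` are hypothetical helpers that only mimic the idea of keying a constant by its value plus its type, first shallowly and then recursively.

```python
import math


def naive_key(value):
    """Key a constant by (type, value); only distinguishes scalars."""
    return (type(value), value)


def deep_key(value):
    """Key a constant by type and value, recursing into tuples and frozensets."""
    if type(value) is tuple:
        return (tuple, tuple(deep_key(item) for item in value))
    if type(value) is frozenset:
        return (frozenset, frozenset(deep_key(item) for item in value))
    if type(value) is float:
        # copysign tells 0.0 apart from -0.0, which compare equal.
        return (float, value, math.copysign(1.0, value))
    return (type(value), value)


constants = [(0, 1), (False, True)]
# The shallow key collapses the two tuples into one dict/set entry.
naive_entries = {naive_key(c) for c in constants}
# The recursive key keeps them apart.
deep_entries = {deep_key(c) for c in constants}
```

Here `naive_entries` ends up with a single entry while `deep_entries` has two, which is the difference between the collapsed LOAD_CONST pair in the disassembly and the intended behavior. Complex numbers with signed zeros would need similar treatment and are left out of this sketch.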
[issue5996] abstract class instantiable when subclassing dict
Eugene Toder added the comment: They are, when there's a most specific metaclass -- the one which is a subclass of all others (described here http://www.python.org/download/releases/2.2.3/descrintro/#metaclasses, implemented here http://hg.python.org/cpython/file/ab162f925761/Objects/typeobject.c#l1956). Since ABCMeta is a subclass of type this holds. Also, in the original example there's no multiple inheritance at all.
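The metaclass rule mentioned above can be observed from Python. This is a sketch in Python 3 syntax with made-up class names (`AbstractDict`, `AbstractPlain`) and a made-up helper (`can_instantiate`); it is not the example attached to the issue.

```python
from abc import ABCMeta, abstractmethod


class AbstractDict(dict, metaclass=ABCMeta):
    """Hypothetical abstract class that subclasses a builtin container."""

    @abstractmethod
    def describe(self):
        """Subclasses are expected to override this."""


class AbstractPlain(metaclass=ABCMeta):
    """Hypothetical abstract class with no builtin base other than object."""

    @abstractmethod
    def describe(self):
        """Subclasses are expected to override this."""


def can_instantiate(cls):
    """Return True if cls() succeeds, False if it raises TypeError."""
    try:
        cls()
    except TypeError:
        return False
    return True
```

`type(AbstractDict)` is `ABCMeta`, the most specific of the metaclasses involved (`type` for `dict`, `ABCMeta` for the class itself), and both classes record `describe` in `__abstractmethods__`. Comparing `can_instantiate(AbstractPlain)` with `can_instantiate(AbstractDict)` shows whether a given interpreter still has the behavior this issue is about.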
[issue11816] Refactor the dis module to provide better building blocks for bytecode analysis
Changes by Eugene Toder: -- nosy: -eltoder
[issue6083] Reference counting bug in PyArg_ParseTuple and PyArg_ParseTupleAndKeywords
Eugene Kapun added the comment:

Actually, this can't be fixed without modifying the C API functions PyArg_ParseTuple and PyArg_ParseTupleAndKeywords, because it's possible to make an object be deallocated before PyArg_ParseTuple returns, so a Py_INCREF immediately after parsing would already be too late.

Here are my test cases:
test-resource.py - in Modules/resource.c, and python-bug-01.patch won't work against it.
test-ctypes.py - in Modules/_ctypes/_ctypes.c.
test-functools.py - in Modules/_functoolsmodule.c (py3k only).

-- components: +Interpreter Core -Extension Modules nosy: +abacabadabacaba title: Reference counting bug in setrlimit -> Reference counting bug in PyArg_ParseTuple and PyArg_ParseTupleAndKeywords Added file: http://bugs.python.org/file19616/test-resource.py
[issue6083] Reference counting bug in PyArg_ParseTuple and PyArg_ParseTupleAndKeywords
Changes by Eugene Kapun: Added file: http://bugs.python.org/file19617/test-ctypes.py
[issue6083] Reference counting bug in PyArg_ParseTuple and PyArg_ParseTupleAndKeywords
Changes by Eugene Kapun: Added file: http://bugs.python.org/file19618/test-functools.py
[issue11244] Negative tuple elements produce inefficient code.
Eugene Toder added the comment: As discussed on the list, peephole refuses to fold -0. The reasons for this are unclear. Folding was disabled with this commit: http://hg.python.org/cpython/diff/660419bdb4ae/Python/compile.c Here's a trivial patch to enable the folding again, along with a test case. make test passes with the patch. The change is independent from Antoine's patches. -- nosy: +eltoder Added file: http://bugs.python.org/file21073/fold-0.patch
[issue11462] Peephole creates duplicate and unused constants
New submission from Eugene Toder:

The peephole optimizer performs constant folding, however:

1) When it replaces an operation with LOAD_CONST it always adds a new constant to co_consts, even if a constant with the same value is already there. It can also add the same constant multiple times.
2) It does not remove constants that are no longer used after the operation was folded.

The result is that the code object after folding has more constants than it needs and so uses more memory.

Attached are patches to address this. The patch for 1) comes in 2 versions. PlanA is simple (it only needs changes in peephole.c), but does linear searches through co_consts and duplicates some logic from compiler.c. PlanB needs changes in both peephole.c and compiler.c, but is free from such problems. I favour PlanB. The patch for 2) can be applied on top of either A or B.

-- components: Interpreter Core messages: 130537 nosy: eltoder priority: normal severity: normal status: open title: Peephole creates duplicate and unused constants type: performance versions: Python 3.3
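One way to look for the duplication described in point 1) is to inspect `co_consts` of a compiled function. The helper below is a hypothetical sketch, not part of any attached patch, and what it reports for a given function depends on the interpreter version, since later releases changed how folding is done.

```python
from collections import Counter


def duplicate_constants(consts):
    """Return {(type name, repr): count} for constants that occur more than once.

    Keying on the type name as well as the repr keeps 3 and 3.0 apart.
    """
    counts = Counter((type(c).__name__, repr(c)) for c in consts)
    return {key: n for key, n in counts.items() if n > 1}


def sample():
    # Both tuple items fold to the same value, so a folder that always
    # appends to co_consts could store 6 twice.
    return 2 * 3, 6
```

Calling `duplicate_constants(sample.__code__.co_consts)` on the interpreter of interest lists any constants stored more than once; an empty dict means none were found.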
[issue11462] Peephole creates duplicate and unused constants
Changes by Eugene Toder: -- keywords: +patch Added file: http://bugs.python.org/file21074/dedup_const_plana.patch
[issue11462] Peephole creates duplicate and unused constants
Eugene Toder added the comment: (either plana or planb should be applied) -- Added file: http://bugs.python.org/file21075/dedup_const_planb.patch
[issue11462] Peephole creates duplicate and unused constants
Changes by Eugene Toder: Added file: http://bugs.python.org/file21076/unused_consts.patch
[issue11462] Peephole creates duplicate and unused constants
Eugene Toder added the comment: (test case) -- nosy: +pitrou Added file: http://bugs.python.org/file21077/consts_test.patch
[issue11462] Peephole creates duplicate and unused constants
Eugene Toder added the comment: I think the changes are fairly trivial. dedup_const_planb.patch is about 10 lines of new code with all the rest being trivial plumbing. unused_consts.patch may look big, but only because I factored out fix-ups into a separate function; there are only about 25 lines of new code.
[issue11462] Peephole creates duplicate and unused constants
Eugene Toder added the comment: Antoine, sure, I'll fix it along with any other suggested changes if the patches are accepted.
[issue11244] Negative tuple elements produce inefficient code.
Eugene Toder added the comment: Mark, looks better now? -- Added file: http://bugs.python.org/file21082/fold-0.patch
[issue11244] Negative tuple elements produce inefficient code.
Changes by Eugene Toder: Removed file: http://bugs.python.org/file21082/fold-0.patch
[issue11244] Negative tuple elements produce inefficient code.
Eugene Toder added the comment: (forgot parens around 0) -- Added file: http://bugs.python.org/file21083/fold-0.2.patch
[issue11462] Peephole creates duplicate and unused constants
Eugene Toder added the comment:

> - why the "#ifndef NDEBUG" path?

These entries in the table should not be used, but if something slips through and uses one of them, it's much easier to tell if we remap to an invalid value. As this is an internal check, I didn't want it in release mode. If this is deemed unnecessary or confusing I can remove it.
[issue11462] Peephole creates duplicate and unused constants
Eugene Toder added the comment: assert() doesn't quite work here. I need to check that this entry in the table is not used in the next loop. I'd need to put the assert in that loop, but by that time I have no easy way to tell which entries are bad, unless I mark them in the first loop. So I mark them under !NDEBUG.
[issue11462] Peephole creates duplicate and unused constants
Eugene Toder added the comment:

> _PyCode_AddObj() should be added to Include/code.h

Should it be declared as PyAPI_FUNC too? This will make it unnecessarily exported (when the patch in Issue11410 is merged), so I wanted to avoid that.

By the way, thank you for reviewing the patch.
[issue11462] Peephole creates duplicate and unused constants
Eugene Toder added the comment:

> Right now, the pattern is tokenize -> parse -> AST -> genbytecode ->
> peephole optimization (which disassembles the bytecode, analyzes it
> and rewrites it) -> final bytecode. The correct pattern is tokenize
> -> parse -> AST -> optimize -> genbytecode -> peephole optimization
> with minimal disassembly, analysis, and opcode rewrites -> final bytecode.

Actually, optimizing on the AST is not ideal either. Ideally you should convert it into a specialized IR, preferably in SSA form and with explicit control flow.

Re size saving: I ran make test with and without my patch and measured the total size of all generated pyc files:

without patch: 16_619_340
with patch:    16_467_867

So it's about 150KB or 1% of the size, not just a few bytes.
[issue11471] If without else generates redundant jump
New submission from Eugene Toder:

An if statement without an else part generates an unnecessary JUMP_FORWARD instruction that jumps right to the next instruction:

>>> def foo(x):
        if x: x = 1
>>> dis(foo)
  2           0 LOAD_FAST                0 (x)
              3 POP_JUMP_IF_FALSE       15
              6 LOAD_CONST               1 (1)
              9 STORE_FAST               0 (x)
             12 JUMP_FORWARD             0 (to 15)
        >>   15 LOAD_CONST               0 (None)
             18 RETURN_VALUE

This patch suppresses generation of this jump.

Testing revealed another issue: when the AST is produced from a string, empty 'orelse' sequences are represented with NULLs. However, when the AST is converted from Python AST objects, an empty 'orelse' is a pointer to a 0-length sequence. I've changed this to produce NULL pointers, like in the string case. This uses less memory and doesn't introduce a different code path in the compiler. Without this change test_compile failed with my first change.

make test passes.

-- components: Interpreter Core files: if_no_else.patch keywords: patch messages: 130623 nosy: eltoder priority: normal severity: normal status: open title: If without else generates redundant jump type: performance versions: Python 3.3 Added file: http://bugs.python.org/file21091/if_no_else.patch
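Both observations above can be checked from Python with the `dis` and `ast` modules. The helpers `jumps_to_next` and `orelse_of` are made up for this sketch; it only inspects what a given interpreter produces, and the bytecode details differ between versions.

```python
import ast
import dis


def foo(x):
    if x:
        x = 1


def jumps_to_next(func):
    """Return JUMP_FORWARD instructions whose target is the very next instruction."""
    instructions = list(dis.get_instructions(func))
    return [
        cur
        for cur, nxt in zip(instructions, instructions[1:])
        if cur.opname == "JUMP_FORWARD" and cur.argval == nxt.offset
    ]


def orelse_of(source):
    """Parse a single `if` statement and return its `orelse` list."""
    return ast.parse(source).body[0].orelse
```

A non-empty result from `jumps_to_next(foo)` would indicate the redundant jump from the report. At the Python level a missing else branch shows up as an empty list, e.g. `orelse_of("if x:\n    x = 1\n")`, which is the "0-length sequence" form that the second part of the report refers to.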
[issue11471] If without else generates redundant jump
Eugene Toder added the comment: Test case (needed some refactoring to avoid duplication). -- Added file: http://bugs.python.org/file21092/if_no_else_test.patch
[issue11462] Peephole creates duplicate and unused constants
Eugene Toder added the comment: Thomas, can you clarify: does loading intern all constants in co_consts, or do you mean that they are mostly small numbers and the like? Without interning I think that the in-memory size difference is even bigger than on-disk, as you get one entry in the tuple and the object itself. I'm sure I can cook up a test that will show some perf difference, because of cache misses or paging. You can say that this is not real world code, and you will likely be right. But in the real world (before you add inlining and constant propagation) constant folding doesn't make a lot of difference either, yet people asked for it and the peepholer does it. Btw, Antoine just improved it quite a bit (Issue11244), so the size difference with my patch should increase. My rationale for the patch is that 1) it's really very simple 2) it removes the feeling of a half-done job when you look at the bytecode.
[issue11244] Negative tuple elements produce inefficient code.
Eugene Toder added the comment: Yes, my patch doesn't fix the regression, only a special case of -0.
[issue11462] Peephole creates duplicate and unused constants
Eugene Toder added the comment:

Alexander, my patch does 2 optimizations: it doesn't insert a new constant if one already exists, and it removes unused constants after peephole is done. Your patch seems to do only the latter. It's very similar; from a quick look at your patch:
- My patch doesn't introduce any additional passes over the code (you added 2 passes).
- It preserves the doc string.
- It's less code, because I reuse more of the existing code.

Feel free to look at the patch and tell me if you don't agree.
[issue11462] Peephole creates duplicate and unused constants
Eugene Toder added the comment: Raymond, you can be assured that I'm not developing this patch, unless I'm told it has chances to be accepted. I don't like to waste my time. On a related note, your review of my other patches is appreciated.
[issue6981] locale.getdefaultlocale() envvars default code and documentation mismatch
Eugene Crosser added the comment:

I don't know if the solution suggested in the report is right, but I can confirm that the current logic of getdefaultlocale() produces wrong results. I have

LANG=en_US.UTF-8
LANGUAGE=en_US:en
LC_CTYPE=ru_RU.UTF-8
LC_COLLATE=ru_RU.UTF-8

which means, according to the documentation, "Do everything in English, but recognize Russian words and sort according to Russian alphabet". All other software honors those semantics, except Python, which returns Russian as the overall default locale:

Python 2.7.1+ (r271:86832, Feb 24 2011, 15:00:15)
[GCC 4.5.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> print locale.getdefaultlocale()
('ru_RU', 'UTF8')

I believe that because LC_CTYPE controls only one specific aspect of the locale, it either should not be used at all, or should be used only as the last resort when the locale cannot be determined from LANG or LANGUAGE. I think that the current search order "envvars=('LC_ALL', 'LC_CTYPE', 'LANG', 'LANGUAGE')" is wrong.

-- nosy: +crosser
[issue6981] locale.getdefaultlocale() envvars default code and documentation mismatch
Eugene Crosser added the comment:

Steffen: can you please be more specific? As I read section 8.2 of the cited document, I do not see a disparity with my statement. There is even an example:

"""
For example, if a user wanted to interact with the system in French, but required to sort German text files, LANG and LC_COLLATE could be defined as:
LANG=Fr_FR
LC_COLLATE=De_DE
"""

which is (almost) exactly my case. I have LANG set to en_US to tell the system that I want to interact in English, and LC_CTYPE set to Russian to tell it that "classification of characters" needs to be Russian-specific. Note that I do *not* have LC_ALL set, because it takes precedence over all other LC_* variables, which is *not* what I want.

I believe that the correct "guessing order", according to the document that you cited, would be:

LANG
LC_ALL
then possibly (possibly because it does not have encoding info) LANGUAGE
then optionally, as a last resort, LC_CTYPE and other LC_* variables.
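The effect of the search order on the environment quoted in this thread can be sketched without touching the real locale module. `first_set` below is a hypothetical helper, not the implementation of locale.getdefaultlocale(); it only models "take the first variable in the order that is set". `PROPOSED_ORDER` is the order as written in the comment above.

```python
# Environment from the report, as a plain dict (no real os.environ access).
REPORTED_ENV = {
    "LANG": "en_US.UTF-8",
    "LANGUAGE": "en_US:en",
    "LC_CTYPE": "ru_RU.UTF-8",
    "LC_COLLATE": "ru_RU.UTF-8",
}

# Default search order quoted in the report.
CURRENT_ORDER = ("LC_ALL", "LC_CTYPE", "LANG", "LANGUAGE")
# Order suggested in the comment above (LC_CTYPE only as a last resort).
PROPOSED_ORDER = ("LANG", "LC_ALL", "LANGUAGE", "LC_CTYPE")


def first_set(env, order):
    """Return (name, value) of the first variable in `order` that is set and non-empty."""
    for name in order:
        value = env.get(name)
        if value:
            return name, value
    return None
```

With the reported environment, `first_set(REPORTED_ENV, CURRENT_ORDER)` picks LC_CTYPE and therefore the Russian locale, while `first_set(REPORTED_ENV, PROPOSED_ORDER)` picks LANG and the English one.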
[issue11510] Peephole breaks set unpacking
New submission from Eugene Toder:

Ever since the 'x in set' optimization was added to peephole, it has had the following bug with set unpacking:

>>> def foo(x,y):
        a,b = {x,y}
        return a,b
>>> dis(foo)
  2           0 LOAD_FAST                0 (x)
              3 LOAD_FAST                1 (y)
              6 ROT_TWO
              7 STORE_FAST               2 (a)
             10 STORE_FAST               3 (b)

  3          13 LOAD_FAST                2 (a)
             16 LOAD_FAST                3 (b)
             19 BUILD_TUPLE              2
             22 RETURN_VALUE

That is, unpacking of a literal set of size 2 or 3 is changed to ROT instructions. This, however, changes the semantics, as construction of the set would eliminate duplicates. The difference can be demonstrated like this:

Python 3.1
>>> foo(1,1)
Traceback (most recent call last):
  File "", line 1, in
  File "", line 2, in foo
ValueError: need more than 1 value to unpack

Python 3.2
>>> foo(1,1)
(1, 1)

-- components: Interpreter Core messages: 130917 nosy: eltoder priority: normal severity: normal status: open title: Peephole breaks set unpacking type: compile error versions: Python 3.2
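The semantic difference reported above is easy to turn into a regression check. This is a sketch rather than the test that went into CPython; `unpack_error` is a made-up helper that returns the exception instead of letting it propagate.

```python
def foo(x, y):
    # Building the set first removes duplicates, so foo(1, 1) has only
    # one element to unpack into two names.
    a, b = {x, y}
    return a, b


def unpack_error(x, y):
    """Return the ValueError raised by foo(x, y), or None if it succeeds."""
    try:
        foo(x, y)
    except ValueError as exc:
        return exc
    return None
```

On an interpreter without the bug, `unpack_error(1, 1)` returns a ValueError (the Python 3.1 behavior shown above), while `unpack_error(1, 2)` returns None.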
[issue11549] Rewrite peephole to work on AST
New submission from Eugene Toder:

As pointed out by Raymond, constant folding should be done on the AST rather than on generated bytecode. Here's a patch to do that. It's rather long, so overview first.

The patch replaces the existing peephole pass with a folding pass on the AST and a few changes in the compiler. Feature-wise it should be on par with the old peepholer, applying optimizations more consistently and thoroughly, but maybe missing some others. It passes all peepholer tests (though they seem not very extensive) and 'make test', but more testing is, of course, needed.

I've split it into 5 pieces for easier reviewing, but these are not 5 separate patches; they should all be applied together. I can upload it somewhere for review or split it in other ways, let me know. Also, the patches are versus 1e00b161f5f5; I will redo them against head.

TOC:
1. Changes to AST
2. Folding pass
3. Changes to compiler
4. Generated files (since they're checked in)
5. Tests

In detail:

1. I needed to make some changes to the AST to enable constant folding. These are:

a) Merge the Num, Str, Bytes and Ellipsis constructors into a single Lit (literal) that can hold anything. For one thing, this removes a good deal of copy-paste in the code, since these are always treated the same. (There were a couple of places where the Bytes ctor was missing for no apparent reason; I think it was forgotten.) Otherwise, I would have to introduce at least 4 more node types: None, Bool, TupleConst, SetConst. This seemed excessive.

b) Docstring is now an attribute of Module, FunctionDef and ClassDef, rather than a first statement. A docstring is a special syntactic construction, it's not executable code, so it makes sense to separate it. Otherwise, the optimizer would have to take extra care not to introduce, change or remove a docstring. For example:

def foo():
    "doc" + "string"

Without optimizations foo doesn't have a docstring. After folding, however, the first statement in foo is a string literal. This means that the docstring depends on the level of optimizations. Making it an attribute avoids the problem.

c) 'None', 'True' and 'False' are parsed as literals instead of names, even without optimizations. Since they are not redefinable, I think it makes most sense to treat them as literals. This isn't strictly needed for folding, and I haven't removed all the artefacts, in case this turns out controversial.

2. Constant folding (and a couple of other tweaks) is performed by a visitor. The visitor is auto-generated from ASDL and a C template. The C template (Python/ast_opt.ct) provides code for optimizations and rules on how to call it. Parser/asdl_ct.py takes this and ASDL and generates a visitor that visits only nodes which have associated rules (but visits them via all paths). The code for the optimizations itself is pretty straightforward. The generator can probably be used for symtable.c too, removing ~200 tedious lines of code.

3. Changes to the compiler are in 3 categories:

a) Updates for AST changes.

b) Changes to generate better code and not need any optimizations. This includes the tuple unpacking optimization and if/while conditions.

c) A simple peephole pass on compiler internal structures. This is a better form for doing this than bytecode. The pass only deals with jumps to jumps/returns and trivial dead code.

I've also made 'raise' recognized as a terminator, so that 'return None' is not inserted after it.

4, 5. No big surprises here.

-- components: Interpreter Core messages: 130955 nosy: eltoder priority: normal severity: normal status: open title: Rewrite peephole to work on AST versions: Python 3.3
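The docstring example from point 1b can be examined with the standard `ast` module. The sketch below is illustrative only and does not use the proposed patch; `docstring_from_ast` and `docstring_at_runtime` are made-up helper names.

```python
import ast

# The function from the message: its first statement is an expression,
# not a string literal, so it is not a docstring in the unoptimized AST.
FOLDED = 'def foo():\n    "doc" + "string"\n'
# The same function with a genuine literal docstring.
LITERAL = 'def foo():\n    "docstring"\n'


def docstring_from_ast(source):
    """Return the docstring the parser sees for the first definition in source."""
    return ast.get_docstring(ast.parse(source).body[0])


def docstring_at_runtime(source):
    """Compile and run source, then return foo.__doc__."""
    namespace = {}
    exec(compile(source, "<example>", "exec"), namespace)
    return namespace["foo"].__doc__
```

`docstring_from_ast(FOLDED)` is None while `docstring_from_ast(LITERAL)` is the string itself; comparing `docstring_at_runtime(FOLDED)` across optimization settings is one way to see whether folding leaks into `__doc__` on a given interpreter.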
[issue11549] Rewrite peephole to work on AST
Changes by Eugene Toder: -- keywords: +patch Added file: http://bugs.python.org/file21198/0_ast.patch
[issue11549] Rewrite peephole to work on AST
Changes by Eugene Toder: Added file: http://bugs.python.org/file21199/0_fold.patch
[issue11549] Rewrite peephole to work on AST
Changes by Eugene Toder: Added file: http://bugs.python.org/file21200/0_compile.patch
[issue11549] Rewrite peephole to work on AST
Changes by Eugene Toder: Added file: http://bugs.python.org/file21201/0_generated.patch
[issue11549] Rewrite peephole to work on AST
Changes by Eugene Toder: Added file: http://bugs.python.org/file21202/0_tests.patch
[issue11549] Rewrite peephole to work on AST
Changes by Eugene Toder: -- nosy: +pitrou, rhettinger
[issue11244] Negative tuple elements produce inefficient code.
Eugene Toder added the comment: Is anyone reviewing my patch? It's just 1 line long. Should it be moved to another issue? Though, technically, it's needed to address the regression in question: Python 3.1 folds -0, the current code still does not.
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment: Because I don't know how to make them. Any pointers?
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment: Thanks. Review link: http://codereview.appspot.com/4281051
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment: I see. Should I attach diffs vs. some revision from hg.python.org?
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment: Any comments on the code so far or suggestions on how we should move forward?
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment: I've updated patches on Rietveld with some small changes. This includes better code generation for boolops used outside of conditions and cleaned up optimize_jumps(). This is probably the last change before I get some feedback. Also, I forgot to mention yesterday, patches on Rietveld are vs. ab45c4d0b6ef
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment: Just for fun I've run pystones. W/o my changes it averages to about 70k, with my changes to about 72.5k.
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment: AFAICT my patch has everything that #1346238 has, except BoolOps, which can be easily added (there's a TODO). I don't want to add any new code, though, until the current patch gets reviewed -- adding code will only make reviewing harder. #10399 looks interesting, I will take a look.
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment: Is anyone looking or planning to look at the patch?
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment:

Thanks.

> string concatenation will now work, and errors like "'hello' - 'world'"
> should give a more informative TypeError

Yes, 'x'*5 works too.

> Bikeshed: We use Attribute rather than Attr for that node type,
> perhaps the full "Literal" name would be better

Lit seemed more in line with Num, Str, BinOp etc. No reason it can't be changed, of course.

> Lib/test/disutil.py should really be made a feature of the dis module
> itself

Agreed, but I didn't want to widen the scope of the patch. If this is something that can be reviewed quickly, I can make a change to dis. I'd add a keyword-only arg to dis and disassembly -- an output stream defaulting to stdout. dis_to_str then passes StringIO and returns the string. Sounds OK?

> Since the disassembly is interpreter specific, the new disassembly
> tests really shouldn't go directly in test_compile.py. A separate
> "test_ast_optimiser" file would be easier for alternate
> implementations to skip over.

New tests in test_compiler are not for the AST pass, but for changes to compile.c. I can split them out, how about test_compiler_opts?

> I'd like to see a written explanation for the first few changes in
> test_peepholer.py

Sure.
1) not x == 2 can be theoretically optimized to x != 2, while this test is for another optimization. not x is more robust.
2) Expression statement which is just a literal doesn't produce any code at all. This is now true for None/True/False as well. To preserve constants in the output I've put them in tuples.

> When you get around to rebasing the patch on 3.3 trunk, don't forget
> to drop any unneeded "from __future__" imports.

If you're referring to asdl_ct.py, that's actually an interesting question. asdl_ct.py is run by system installed python2, for obvious reasons. What is the policy here -- what is the minimum version of system python that should be sufficient to build python3? I tested my code on 2.6 and 3.1, and with __future__ it should work on 2.5 as well. Is this OK or should I drop 'with' so it runs on 2.4?

> The generated code for the Lit node type looks wrong: it sets v to
> Py_None, then immediately checks to see if v is NULL again.

Right, comment in asdl_c.py says:
# XXX: special hack for Lit. Lit value can be None and it
# should be stored as Py_None, not as NULL.
If there's a general agreement on Lit I can certainly clean this up.

> Don't use "string" as a C type - use "char *" (and "char **" instead
> of "string *").

string is a typedef for PyObject that ASDL uses. I don't think I have a choice to not use it. Can you point to a specific place where char* could be used?

> There should be a new compiler flag to skip the AST optimisation step.

There's already an 'optimizations level' flag. Maybe we should make it more meaningful rather than multiplying the number of flags?

> A bunch of the compiler internal changes seem to make the basic flow
> of the generated assembly not match the incoming source code.

Can you give an example of what you mean? The changes are basically 1) standard way of handling conditions in simple compilers 2) peephole.

> It doesn't seem like a good idea to conflate these with the AST
> optimisation patch. If that means leaving the peepholer in to handle
> them for now, that's OK - it's fine to just descope the peepholer
> without removing it entirely.

The reason why I think it makes sense to have this in a single change is testing. This allows reusing all existing peephole tests. If I leave the old peephole enabled there's no way to tell from the disassembly whether my pass did something. I can port tests to AST, but that seemed like more work than matching the old peepholer optimizations.

Is there any opposition to doing simple optimizations on compiler structures? They seem a good fit for the job. In fact, if not for stack representation, I'd say that they are better IR for optimizer than AST.

Also, can I get your opinion on making None/True/False into literals early on and getting rid of forbidden_name?

Antoine, Georg -- I think Nick's question is not about AST changing after optimizations (this can indeed be a separate flag), but the structure of AST changing. E.g. collapsing of Num/Str/Bytes into Lit.

Btw, if this is acceptable I'd make a couple more changes to make scope structure obvious from AST. This will allow auto-generating much of the symtable pass.
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment:

> and with __future__ it should work on 2.5 as well.

Actually, it seems that at least str.format is not in 2.5 either. Still, the question is whether I should make it run on 2.5 or 2.4, or whether 2.6 is OK (then __future__ can be removed).
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment:

> I don't think it can:

That already doesn't work in dict and set (eq not consistent with hash), I don't think it's a big problem if that stops working in some other cases. Anyway, I said "theoretically" -- maybe after some conservative type inference.
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment: Also, to avoid any confusion -- currently my patch only runs AST optimizations before code generation, so compile() with ast.PyCF_ONLY_AST returns non-optimized AST.
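To make the PyCF_ONLY_AST point concrete, here is a minimal sketch of how one might inspect the tree that compile() hands back. It assumes a CPython release where, as far as I know, constant folding is not applied to the tree returned under this flag; the file name "<demo>" and the sample source are made up for illustration.

```python
import ast

source = "x = 1 + 2"

# Ask compile() for the AST only; no code object is generated.
tree = compile(source, "<demo>", "exec", ast.PyCF_ONLY_AST)

# Inspect the right-hand side of the assignment: if no optimization
# ran on the tree, it is still a BinOp rather than a folded constant.
value_node = tree.body[0].value
print(type(value_node).__name__)
print(ast.dump(value_node))
```

Compiling the same source to a code object and looking at its constants is one way to see where the folding actually takes place.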
[issue11549] Rewrite peephole to work on AST
Eugene Toder added the comment:

If we have to preserve backward compatibility of Python AST API, we can do this relatively easily (at the expense of some code complexity):
* Add 'version' argument to compile() and ast.parse() with default value of 1 (old AST). Value 2 will correspond to the new AST.
* Do not remove Num/Str/Bytes/Ellipsis Python classes. Make PyAST_obj2mod and PyAST_mod2obj do appropriate conversions when version is 1.
* Possibly emit a PendingDeprecationWarning when version 1 is used with the goal of removing it in 3.5

Alternative implementation is to leave Num/Str/etc classes in C as well, and write visitors (similar to folding one) to convert AST between old and new forms.

Does this sound reasonable? Should this be posted to python-dev? Should I write a PEP (I'd need some help with this)?
Are there any other big issues preventing this from being merged?
[issue11816] Add functions to return disassembly as string
New submission from Eugene Toder:

As discussed in Issue11549 a couple of tests need to inspect disassembly of some code. Currently they have to override sys.stdout, run dis and restore stdout back. It would be much nicer if dis module provided functions that return disassembly as a string.

Provided is a patch that adds file argument to most dis functions, defaulting to sys.stdout. On top of that there are 2 new functions: dis_to_str and disassembly_to_str that return disassembly as a string instead of writing it to a file.

--
components: Library (Lib)
files: dis.diff
keywords: patch
messages: 133437
nosy: eltoder
priority: normal
severity: normal
status: open
title: Add functions to return disassembly as string
type: feature request
versions: Python 3.3
Added file: http://bugs.python.org/file21598/dis.diff

Python tracker <http://bugs.python.org/issue11816>
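For readers on a current interpreter, the idea of returning disassembly as a string can be sketched with the file keyword that dis.dis accepts in Python 3.4 and later. The helper name disassembly_text and the sample function add_one are hypothetical; this is not the dis_to_str function from the attached patch.

```python
import dis
import io


def disassembly_text(obj):
    """Return the disassembly of obj as a string (hypothetical helper)."""
    buffer = io.StringIO()
    # Write the listing into the buffer instead of sys.stdout.
    dis.dis(obj, file=buffer)
    return buffer.getvalue()


def add_one(x):
    return x + 1


listing = disassembly_text(add_one)
print(listing)
```

A test can then search the returned string for opcode names without touching sys.stdout.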
[issue11816] Add functions to return disassembly as string
Changes by Eugene Toder: Removed file: http://bugs.python.org/file21598/dis.diff
[issue11816] Add functions to return disassembly as string
Changes by Eugene Toder: Added file: http://bugs.python.org/file21599/dis.patch
[issue11816] Add functions to return disassembly as string
Changes by Eugene Toder: nosy: +ncoghlan
[issue11816] Add functions to return disassembly as string
Eugene Toder added the comment: Agreed, but that would require rewriting of all tests in test_peepholer.
[issue5996] abstract class instantiable when subclassing dict
Eugene Toder added the comment: This patch fixes the problem by moving the check from object_new to PyType_GenericAlloc. The check is very cheap, so this should not be an issue.
--
keywords: +patch
nosy: +eltoder
Added file: http://bugs.python.org/file21600/abc.patch

Python tracker <http://bugs.python.org/issue5996>
[issue11816] Refactor the dis module to provide better building blocks for bytecode analysis
Eugene Toder added the comment: So in the near term, dis-based tests should continue to copy/paste sys.stdout redirection code?
[issue11983] Inconsistent hash and comparison for code objects
New submission from Eugene Toder:

A comment in the definition of PyCodeObject in Include/code.h says:

/* The rest doesn't count for hash or comparisons */

which, I think, makes a lot of sense. The implementation doesn't follow this comment, though. code_hash actually includes co_name and code_richcompare includes co_name and co_firstlineno. This makes hash and comparison inconsistent with each other and with the comment.

--
components: Interpreter Core
messages: 135015
nosy: eltoder
priority: normal
severity: normal
status: open
title: Inconsistent hash and comparison for code objects
type: behavior
versions: Python 3.3

Python tracker <http://bugs.python.org/issue11983>
[issue11983] Inconsistent hash and comparison for code objects
Eugene Toder added the comment: I would propose changing the implementation to match the comment. At a minimum, remove the co_firstlineno comparison. As a last resort, at least change the comment.
[issue11983] Inconsistent hash and comparison for code objects
Eugene Toder added the comment:

It appears that
* co_name was added to hash and cmp in this check-in by Guido: http://hg.python.org/cpython-fullhistory/diff/525b2358721e/Python/compile.c
  I think the reason was to preserve function name when defining multiple functions with the same code in one function or module. (By default a function gets its name from code, but only one code object will be preserved, since all constants in a function are stored in a dict during compilation).
* co_firstlineno was added to cmp (but not hash) in this check-in by Brett: http://hg.python.org/cpython-fullhistory/rev/8127a55a57cb
  In an attempt to fix this bug: http://www.mail-archive.com/python-bugs-list@python.org/msg02440.html
  It doesn't actually fix the bug and makes hash inconsistent with cmp. I'm not convinced that the bug is valid -- why would you care if identical lambdas share or not share the code?

Both changes seem to come from a tension between code object's original intention to compare "functionally equivalent" codes equal and side-effects of constants de-duplication in a function (loss of function name, broken breakpoints and line numbers in a debugger).

I can think of 2 ways to address this. Change hash/cmp back to ignore co_name and co_firstlineno and then:
1) Never dedup codes, or only dedup codes with the same co_firstlineno (can compare co_name too, but I can't think of a way to create 2 named funcs on the same line). This is pretty much what the current code does.
2) Separate "debug information" (co_filename, co_name, co_firstlineno, co_lnotab) from code object into a separate object. Construct a function from both objects. Allow de-duplication of both. This will always preserve all information in a function, but allows sharing the code object between identical functions. This is a much more intrusive change, though, e.g. frame will need a reference to debug information.

--
nosy: +Jeremy.Hylton, brett.cannon, ncoghlan
[issue11983] Inconsistent hash and comparison for code objects
Eugene Toder added the comment: Btw, disabling dedup for codes won't be the first exception -- we already avoid coalescing -0.0 with 0.0 for float and complex, even though they compare equal.
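The -0.0 remark can be checked from Python with a small sketch. The function name signed_zeros is made up for illustration; the point is only that two constants which compare equal must still be stored separately, because their signs differ.

```python
import math


def signed_zeros():
    # Both constants live in the same function body.
    return 0.0, -0.0


positive, negative = signed_zeros()

# The two values compare equal...
print(positive == negative)
# ...but copysign shows that each kept its own sign.
print(math.copysign(1.0, positive), math.copysign(1.0, negative))
```

If the compiler merged the two constants into one entry, the second copysign call could not return a negative value.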
[issue11262] re.sub replaces only first 32 matches with re.U flag
New submission from Eugene Morozov:

There's a peculiar and difficult to find bug in the re.sub method. Try the following example:

>>> text = 'X'*4096
>>> nt = re.sub(u"XX", u".", text, re.U)
>>> nt
u'XXX'

(only 32 dots, the rest of the string is not changed).

If I first compile the regexp, and then perform compiled_regexp.sub, everything seems to work correctly.

--
components: Regular Expressions
messages: 128923
nosy: Eugene.Morozov
priority: normal
severity: normal
status: open
title: re.sub replaces only first 32 matches with re.U flag
type: security
versions: Python 2.6

Python tracker <http://bugs.python.org/issue11262>
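A likely explanation, which the report itself does not state: the fourth positional parameter of re.sub is count, and the integer value of re.U is 32, so the flag is read as "replace at most 32 matches". The sketch below assumes a current Python 3, where re.sub also takes a flags keyword, and passes the count explicitly by keyword to reproduce the effect.

```python
import re

text = "X" * 4096

# int(re.U) is 32, so using it as the count caps substitutions at 32.
capped = re.sub("XX", ".", text, count=int(re.U))

# Passing the flag by keyword replaces every match.
full = re.sub("XX", ".", text, flags=re.U)

print(capped.count("."), full.count("."))
```

This would also fit the observation that a precompiled pattern behaves correctly, since there the flag is given to re.compile and never reaches the count position.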
[issue8075] Windows (Vista/7) install error when choosing to compile .py files
New submission from Eugene Baranov:

I tried installing Python 2.6.4 into Program Files in Windows 7 and chose to compile .py files after install. The installer correctly asks for an elevation and copies all files, but it looks like the compile batch is started from the initial, unelevated context. (It displays access denied messages.) The issue does not appear when starting the installer from an already elevated command prompt.

--
components: Installation
messages: 100504
nosy: Regent
severity: normal
status: open
title: Windows (Vista/7) install error when choosing to compile .py files
versions: Python 2.6

Python tracker <http://bugs.python.org/issue8075>
[issue8401] Strange behavior of bytearray slice assignment
New submission from Eugene Kapun:

>>> a = bytearray()
>>> a[:] = 0 # Is it a feature?
>>> a
bytearray(b'')
>>> a[:] = 10 # If so, why not document it?
>>> a
bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')
>>> a[:] = -1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: negative count
>>> a[:] = -10 # This should raise ValueError, not TypeError.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable
>>> a[:] = 100
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError
>>> a[:] = 10 # This should raise OverflowError, not TypeError.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable
>>> a[:] = [] # Are some empty sequences better than others?
>>> a[:] = ()
>>> a[:] = list("")
>>> a[:] = ""
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: string argument without an encoding

--
components: Interpreter Core
messages: 103133
nosy: abacabadabacaba
severity: normal
status: open
title: Strange behavior of bytearray slice assignment
type: behavior
versions: Python 3.1

Python tracker <http://bugs.python.org/issue8401>
[issue8401] Strange behavior of bytearray slice assignment
Eugene Kapun added the comment:

Empty string is an iterable of integers in the range 0 <= x < 256, so it should be allowed.

>>> all(isinstance(x, int) and 0 <= x < 256 for x in "")
True
>>> bytearray()[:] = ""
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: string argument without an encoding
[issue8417] bytes and bytearray constructors raise TypeError for very large ints
New submission from Eugene Kapun:

>>> bytes(10)
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> bytes(10 ** 100)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable

--
components: Interpreter Core
messages: 103300
nosy: abacabadabacaba
severity: normal
status: open
title: bytes and bytearray constructors raise TypeError for very large ints
type: behavior
versions: Python 3.1

Python tracker <http://bugs.python.org/issue8417>
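To see what a given interpreter does with the huge argument, a version-neutral probe can be sketched as follows. Which exception type comes back is deliberately not assumed here, since that is exactly what the report is about and it may differ between releases.

```python
# An integer argument asks for that many zero bytes.
small = bytes(10)
print(small)

try:
    bytes(10 ** 100)
except (TypeError, OverflowError, MemoryError) as exc:
    # Report whichever exception this interpreter chose.
    print(type(exc).__name__)
```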
[issue1766304] improve xrange.__contains__
Changes by Eugene Kapun: nosy: +abacabadabacaba
Python tracker <http://bugs.python.org/issue1766304>
[issue8401] Strange behavior of bytearray slice assignment
Eugene Kapun added the comment:

> Not really, chars are not ints

Yes, however, empty string contains exactly zero chars.

> and anyway the empty string fall in the first case.

Strings aren't mentioned in documentation of bytearray slice assignment. However, I think that bytearray constructor should accept empty string too, without an encoding, for consistency.
[issue8401] Strange behavior of bytearray slice assignment
Eugene Kapun added the comment:

__doc__ of bytearray says:

> bytearray(iterable_of_ints) -> bytearray
> bytearray(string, encoding[, errors]) -> bytearray
> bytearray(bytes_or_bytearray) -> mutable copy of bytes_or_bytearray
> bytearray(memory_view) -> bytearray

So, unless an encoding is given, empty string should be interpreted as an iterable of ints.

BTW, documentation and docstring should be made consistent with each other.
[issue8420] set_lookkey is unsafe
New submission from Eugene Kapun:

I've noticed that set_lookkey (in Objects/setobject.c) does some unsafe things:

Objects/setobject.c:
> if (entry->hash == hash) {
>     startkey = entry->key;
>     Py_INCREF(startkey);
>     cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
>     Py_DECREF(startkey);

At this point, object pointed to by startkey could be deallocated, and then new object may be allocated at the same address.

>     if (cmp < 0)
>         return NULL;
>     if (table == so->table && entry->key == startkey) {

At this point, the table may be reallocated at the same address but with different (possibly smaller) size, so entry->key may be in deallocated memory. Also, entry->key may be equal to startkey but still point to an object other than one key was compared with.

>         if (cmp > 0)
>             return entry;
>     }
>     else {
>         /* The compare did major nasty stuff to the
>          * set: start over.
>          */
>         return set_lookkey(so, key, hash);

This can lead to infinite recursion.

>     }

--
components: Interpreter Core
messages: 10
nosy: abacabadabacaba
severity: normal
status: open
title: set_lookkey is unsafe
versions: Python 3.1

Python tracker <http://bugs.python.org/issue8420>
[issue8425] a -= b should be fast if a is a small set and b is a large set
New submission from Eugene Kapun:

>>> small_set = set(range(2000))
>>> large_set = set(range(2000))
>>> large_set -= small_set # Fast
>>> small_set -= large_set # Slow, should be fast
>>> small_set = small_set - large_set # Fast

--
components: Interpreter Core
messages: 103343
nosy: abacabadabacaba
severity: normal
status: open
title: a -= b should be fast if a is a small set and b is a large set
type: resource usage
versions: Python 3.1

Python tracker <http://bugs.python.org/issue8425>
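The behaviour being asked for can be sketched in pure Python: when the left-hand set is the smaller one, walk it and probe the larger set instead of walking the larger set. The helper name difference_update_sized and the set sizes below are made up for illustration; this is not the C change itself.

```python
def difference_update_sized(target, other):
    """Remove from target every element also in other (hypothetical helper)."""
    if len(target) < len(other):
        # Iterate over the smaller set and probe the larger one.
        # The matches are collected first so target is not resized mid-iteration.
        for item in [x for x in target if x in other]:
            target.discard(item)
    else:
        target -= other
    return target


small_set = set(range(10))
large_set = set(range(5, 100000))
difference_update_sized(small_set, large_set)
print(sorted(small_set))
```

The work done is then proportional to the size of the smaller operand, which is what the "Slow, should be fast" line in the report is asking for.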
[issue8401] Strange behavior of bytearray slice assignment
Eugene Kapun added the comment:

-1 on special-casing string without an encoding. Current code does (almost) this:

...
if argument_is_a_string:
    if not encoding_is_given: # Special case
        raise TypeError("string argument without an encoding")
    encode_argument()
    return
if encoding_is_given:
    raise TypeError("encoding or errors without a string argument")
...

IMO, it should do this instead:

...
if encoding_is_given:
    if not argument_is_a_string:
        raise TypeError("encoding or errors without a string argument")
    encode_argument()
    return
...

This way, bytearray("") would work without any special cases.
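The two orderings can be modelled as runnable Python to see how they treat an empty string. This is a simplified sketch, not the constructor's real code: the function names are hypothetical, only the string and iterable-of-ints cases are modelled, and the integer and buffer arguments are left out.

```python
def to_bytes_current(arg, encoding=None):
    """Model of the order described as current: strings are checked first."""
    if isinstance(arg, str):
        if encoding is None:  # the special case
            raise TypeError("string argument without an encoding")
        return arg.encode(encoding)
    if encoding is not None:
        raise TypeError("encoding or errors without a string argument")
    return bytes(list(arg))


def to_bytes_proposed(arg, encoding=None):
    """Model of the proposed order: the type only matters with an encoding."""
    if encoding is not None:
        if not isinstance(arg, str):
            raise TypeError("encoding or errors without a string argument")
        return arg.encode(encoding)
    # No encoding: treat the argument as an iterable of ints.
    # An empty string yields no items, so it becomes empty bytes.
    return bytes(list(arg))


print(to_bytes_proposed(""))
try:
    to_bytes_current("")
except TypeError as exc:
    print("current order:", exc)
```

Under the proposed order a non-empty string without an encoding still fails, only later, when its characters turn out not to be integers.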
[issue8420] Objects/setobject.c contains unsafe code
Eugene Kapun added the comment:

I've found more unsafe code in Objects/setobject.c. This code makes Python 3.1.2 segfault by using a bug in function set_merge:

class bad:
    def __eq__(self, other):
        if be_bad:
            set2.clear()
            raise Exception
        return self is other
    def __hash__(self):
        return 0
be_bad = False
set1 = {bad()}
set2 = {bad() for i in range(2000)}
be_bad = True
set1.update(set2)

Function set_symmetric_difference_update has a similar bug.

Another bug in set_symmetric_difference_update:

class bad:
    def __init__(self):
        print("Creating", id(self))
    def __del__(self):
        print("Deleting", id(self))
    def __eq__(self, other):
        print("Comparing", id(self), "and", id(other))
        if be_bad:
            dict2.clear()
        return self is other
    def __hash__(self):
        return 0
be_bad = False
set1 = {bad()}
dict2 = {bad(): None}
be_bad = True
set1.symmetric_difference_update(dict2)

--
title: set_lookkey is unsafe -> Objects/setobject.c contains unsafe code
[issue8435] It is possible to observe a mutating frozenset
New submission from Eugene Kapun:

This code shows that frozensets aren't really immutable. The same frozenset is printed twice, with different content.

Buggy functions are set_contains, set_remove and set_discard, all in Objects/setobject.c

class bad:
    def __eq__(self, other):
        global f2
        f2 = other
        print_f2()
        s1.add("querty")
        return self is other
    def __hash__(self):
        return hash(f1)
def print_f2():
    print(id(f2), repr(f2))
f1 = frozenset((1, 2, 3))
s1 = set(f1)
s1 in {bad()}
print_f2()

--
components: Interpreter Core
messages: 103426
nosy: abacabadabacaba
severity: normal
status: open
title: It is possible to observe a mutating frozenset
type: behavior
versions: Python 3.1

Python tracker <http://bugs.python.org/issue8435>
[issue8420] Objects/setobject.c contains unsafe code
Changes by Eugene Kapun: type: -> crash
[issue8436] set.__init__ accepts keyword args
New submission from Eugene Kapun:

>>> list().__init__(a=0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'a' is an invalid keyword argument for this function
>>> set().__init__(a=0)

--
components: Interpreter Core
messages: 103427
nosy: abacabadabacaba
severity: normal
status: open
title: set.__init__ accepts keyword args
type: behavior
versions: Python 3.1

Python tracker <http://bugs.python.org/issue8436>
[issue8420] Objects/setobject.c contains unsafe code
Eugene Kapun added the comment:

This patch still assumes that if so->table didn't change then the table wasn't reallocated (see http://en.wikipedia.org/wiki/ABA_problem). One solution is to check that so->mask didn't change as well. Also, checking that refcnt > 1 is redundant because if entry->key == startkey then there are at least two references: one from entry->key and another from startkey.

These functions have a bug that may cause them to refer to deallocated memory when both arguments are sets: set_intersection, set_isdisjoint, set_difference_update_internal, set_difference, set_symmetric_difference_update, set_issubset.

These functions may also do the same if the first argument is a set and the second argument is a dict: set_difference, set_symmetric_difference_update.

Bugs in set_repr:

> keys = PySequence_List((PyObject *)so);
> if (keys == NULL)
>     goto done;
>
> listrepr = PyObject_Repr(keys);
> Py_DECREF(keys);

List pointed to by keys is already deallocated at this point.

> if (listrepr == NULL) {
>     Py_DECREF(keys);

But this code tries to DECREF it.

>     goto done;
> }
> newsize = PyUnicode_GET_SIZE(listrepr);
> result = PyUnicode_FromUnicode(NULL, newsize);
> if (result) {
>     u = PyUnicode_AS_UNICODE(result);
>     *u++ = '{';
>     /* Omit the brackets from the listrepr */
>     Py_UNICODE_COPY(u, PyUnicode_AS_UNICODE(listrepr)+1,
>                     PyUnicode_GET_SIZE(listrepr)-2);
>     u += newsize-2;
>     *u++ = '}';
> }
> Py_DECREF(listrepr);
> if (Py_TYPE(so) != &PySet_Type) {

result may be NULL here.

>     PyObject *tmp = PyUnicode_FromFormat("%s(%U)",
>                                          Py_TYPE(so)->tp_name,
>                                          result);

I think PyUnicode_FromFormat won't like it.

>     Py_DECREF(result);
>     result = tmp;
> }
[issue8420] Objects/setobject.c contains unsafe code
Eugene Kapun added the comment:

> > One solution is to check that so->mask didn't
> > change as well.
>
> I saw that and agree it would make a tighter check, but haven't convinced
> myself that it is necessary.

Without this check, it is possible that the comparison shrinks the table, so entry becomes out of bounds. However, if both so->table and so->mask didn't change then entry is still a pointer to one of table elements so it can be used safely.

> > Also, checking that refcnt > 1 is redundant
> > because if entry->key == startkey then there
> > are at least two references: one from entry->key
> > and another from startkey.
>
> It is a meaningful check. We have our own INCREF
> and one for the key being in the table. If the
> count is 1, then it means that the comparison
> check deleted the key from the table or replaced
> its value (see issue 1517).

If the comparison deleted or changed the key then the check entry->key == startkey would fail so refcnt check won't be reached. Checking refcounts is also bad because someone else may have references to the key.

> I don't follow why you think keys is already deallocated.
> When assigned by PySequence_List() without a NULL return, the refcnt is one.
> The call to PyObject_Repr(keys) does not change the refcnt of keys,
> so the Py_DECREF(keys) is correct.

Look at the code again. If listrepr happens to be NULL, you do Py_DECREF(keys) twice (this bug is only present in py3k branch).
[issue8420] Objects/setobject.c contains unsafe code
Eugene Kapun added the comment:

This code crashes Python by using another bug in set_repr. This only affects py3k. This code relies on an out-of-memory condition, so run it like:

$ (ulimit -v 65536 && python3 test.py)

Otherwise, it will eat all your free memory before crashing.

val = "a" * 1

class big:
    def __repr__(self):
        return val

i = 16
while True:
    repr(frozenset(big() for j in range(i)))
    i = (i * 5) >> 2
[issue8420] Objects/setobject.c contains unsafe code
Changes by Eugene Kapun: Added file: http://bugs.python.org/file16981/repr.diff
[issue44018] Bug in random.seed
New submission from Eugene Rossokha:

https://github.com/python/cpython/blob/master/Lib/random.py#L157

If a bytearray is passed as a seed, the function will change it. Either this has to be documented, or the implementation should change.

--
components: Library (Lib)
messages: 392782
nosy: arjaz
priority: normal
severity: normal
status: open
title: Bug in random.seed
versions: Python 3.9
[issue44018] Bug in random.seed
Eugene Rossokha added the comment:

I find the following behaviour very confusing:

>>> import random
>>> a = bytearray("1234", "utf-8")
>>> b = bytearray("1234", "utf-8")
>>> a == b
True
>>> random.seed(a)
>>> a == b
False
>>> a
bytearray(b'1234\xd4\x04U\x9f`.\xabo\xd6\x02\xacv\x80\xda\xcb\xfa\xad\xd1603^\x95\x1f\tz\xf3\x90\x0e\x9d\xe1v\xb6\xdb(Q/.\x00\x0b\x9d\x04\xfb\xa5\x13>\x8b\x1cn\x8d\xf5\x9d\xb3\xa8\xab\x9d`\xbeK\x97\xcc\x9e\x81\xdb')
>>> b
bytearray(b'1234')

The function doesn't document that it will change the input argument.
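The mutation reported above is the kind of result an augmented assignment on a mutable buffer produces. The following is a minimal sketch, not the actual random.seed code: extend_with_digest is a hypothetical helper, and it assumes the seed is extended with a SHA-512 digest via "+=". For a bytearray, "+=" modifies the caller's object in place; for bytes, it only rebinds the local name.

```python
import hashlib


def extend_with_digest(seed):
    # Hypothetical helper (an assumption, not the stdlib code):
    # append a SHA-512 digest to the seed using "+=".
    # For a bytearray this extends the caller's object in place;
    # for bytes it creates a new object bound only to the local name.
    seed += hashlib.sha512(seed).digest()
    return seed


mutable_seed = bytearray(b"1234")
immutable_seed = b"1234"

extend_with_digest(mutable_seed)
extend_with_digest(immutable_seed)

print(len(mutable_seed))   # 4 original bytes plus a 64-byte digest
print(immutable_seed)      # the bytes object is unchanged
```

Passing an immutable copy, for example random.seed(bytes(a)), keeps the caller's bytearray untouched regardless of how the seed is processed internally.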
[issue42125] linecache cannot get source for the __main__ module with a custom loader
Change by Eugene Toder:

--
nosy: +brett.cannon
[issue42125] linecache cannot get source for the __main__ module with a custom loader
Change by Eugene Toder:

--
nosy: +ncoghlan, vstinner
[issue42903] optimize lru_cache for functions with no arguments
New submission from Eugene Toder:

It's convenient to use @lru_cache on functions with no arguments to delay doing some work until the first time it is needed. Since @lru_cache is implemented in C, it is already faster than manually caching in a closure variable. However, it can be made even faster and more memory efficient by not using the dict at all and caching just the one result that the function returns.

Here are my timing results. Before my changes:

$ ./python -m timeit -s "import functools; f = functools.lru_cache()(lambda: 1)" "f()"
500 loops, best of 5: 42.2 nsec per loop
$ ./python -m timeit -s "import functools; f = functools.lru_cache(None)(lambda: 1)" "f()"
500 loops, best of 5: 38.9 nsec per loop

After my changes:

$ ./python -m timeit -s "import functools; f = functools.lru_cache()(lambda: 1)" "f()"
1000 loops, best of 5: 22.6 nsec per loop

So we get an improvement of about 80% compared to the default maxsize and about 70% compared to maxsize=None.

--
components: Library (Lib)
messages: 384883
nosy: eltoder, serhiy.storchaka, vstinner
priority: normal
severity: normal
status: open
title: optimize lru_cache for functions with no arguments
type: performance
versions: Python 3.10
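For reference, "manually caching in a closure variable" could look like the sketch below. This is only an illustration of the idea of storing a single result without a dict; cache_once and load_config are hypothetical names, and this is not the C implementation proposed in the patch.

```python
import functools


def cache_once(func):
    # Hypothetical decorator: caches the single result of a zero-argument
    # function in closure variables instead of a dict keyed by arguments.
    result = None
    computed = False

    @functools.wraps(func)
    def wrapper():
        nonlocal result, computed
        if not computed:
            result = func()
            computed = True
        return result

    return wrapper


calls = []


@cache_once
def load_config():
    calls.append(1)          # record each real invocation
    return {"debug": False}


first = load_config()
second = load_config()
print(first is second, len(calls))
```

The second call returns the stored object without running the function body again, which is the delayed one-time work the submission describes.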
[issue42903] optimize lru_cache for functions with no arguments
Eugene Toder added the comment: As you can see in my original post, the difference between @cache (aka @lru_cache(None)) and just @lru_cache() is negligible in this case. The optimization in this PR makes a much bigger difference. At the expense of some lines of code, that's true. Also, function calls in Python are quite slow, so being faster than a function call is not a high bar.
[issue42903] optimize lru_cache for functions with no arguments
Eugene Toder added the comment: Ammar, thank you for the link.
[issue42903] optimize lru_cache for functions with no arguments
Eugene Toder added the comment: @cache does not address the problem or any of the concerns brought up in the thread. Thread-safe @once is a nice idea, but more work of course.
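To make the "thread-safe @once" idea concrete, here is a rough pure-Python sketch using a lock with a double check. The decorator name once and the function expensive are assumptions for illustration; nothing here is an existing functools API or the design discussed in the thread.

```python
import functools
import threading


def once(func):
    # Hypothetical thread-safe variant: a lock guarantees that func runs
    # at most once, even when several threads make the first call together.
    lock = threading.Lock()
    result = None
    computed = False

    @functools.wraps(func)
    def wrapper():
        nonlocal result, computed
        if not computed:              # fast path once the value exists
            with lock:
                if not computed:      # re-check while holding the lock
                    result = func()
                    computed = True
        return result

    return wrapper


calls = []


@once
def expensive():
    calls.append(1)                   # record each real invocation
    return 42


threads = [threading.Thread(target=expensive) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(calls), expensive())
```

The extra lock and second check are the "more work" relative to a plain single-result cache: without them, two threads racing on the first call could both run the function.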
[issue40962] Add documentation for asyncio._set_running_loop()
New submission from Eugene Huang:

In the pull request https://github.com/python/asyncio/pull/452#issue-92245081 the linked comment says that `asyncio._set_running_loop()` is part of the public asyncio API and will be documented, but I couldn't find any references to this function in https://docs.python.org/3/library/asyncio-eventloop.html or anywhere else (tried quick searching for it) in the docs.

--
assignee: docs@python
components: Documentation
messages: 371387
nosy: docs@python, eugenhu
priority: normal
severity: normal
status: open
title: Add documentation for asyncio._set_running_loop()
versions: Python 3.10, Python 3.7, Python 3.8, Python 3.9
[issue42125] linecache cannot get source for the __main__ module with a custom loader
New submission from Eugene Toder:

If a module has a loader, linecache calls its get_source() passing __name__ as the argument. This works most of the time, except that the __main__ module has it set to "__main__", which is commonly not the real name of the module. Luckily, we now have __spec__ which has the real name, so we can just use it.

Attached zip file reproduces the problem:

$ python t.zip
Traceback (most recent call last):
  ...
  File "t.zip/t.py", line 11, in <module>
  File "t.zip/t.py", line 8, in f
  File "t.zip/t.py", line 8, in f
  File "t.zip/t.py", line 8, in f
  [Previous line repeated 2 more times]
  File "t.zip/t.py", line 7, in f
ValueError

Note that entries from t.py don't have source code lines.

--
components: Library (Lib)
files: t.zip
messages: 379408
nosy: eltoder
priority: normal
severity: normal
status: open
title: linecache cannot get source for the __main__ module with a custom loader
type: behavior
versions: Python 3.10
Added file: https://bugs.python.org/file49536/t.zip
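The name lookup suggested above might be sketched as follows. This is an illustration only, not the linecache patch: module_name_for_source is a hypothetical helper, and the spec object is faked with types.SimpleNamespace so the example runs without a custom loader.

```python
import types


def module_name_for_source(module_globals):
    # Hypothetical helper: prefer the real name recorded in __spec__,
    # falling back to __name__ when no spec (or no spec name) is available.
    spec = module_globals.get("__spec__")
    name = getattr(spec, "name", None)
    return name or module_globals.get("__name__")


# Globals as they might look for a module executed as __main__ whose
# real name is "t" (an assumed name for this example).
main_globals = {
    "__name__": "__main__",
    "__spec__": types.SimpleNamespace(name="t"),
}
# Globals of an ordinary module without a spec.
plain_globals = {"__name__": "pkg.mod", "__spec__": None}

print(module_name_for_source(main_globals))
print(module_name_for_source(plain_globals))
```

The resulting name is what would be passed to the loader's get_source() in place of the bare __name__.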
[issue1065427] sre_parse group limit check missing with 'python -O'
Eugene Voytitsky added the comment:

Hi all, can someone answer the questions asked by Keith Briggs in 2004:

> What is the reason for this limit? Can it easily be removed?
> It is causing me many problems.

I have also run into the problem with this limitation. You can read the details here: https://groups.google.com/forum/#!topic/ply-hack/iQLnZpTL1GA[1-25]

Thanks.

--
nosy: +Eugene.Voytitsky
[issue11549] Build-out an AST optimizer, moving some functionality out of the peephole optimizer
Eugene Toder added the comment:

> Method calls on literals are always fair game, though (e.g. you could
> optimise "a b c".split())

What about optimizations that do not change behavior, except for different error messages? E.g. we can change

y = [1,2][x]

to

y = (1,2)[x]

where the tuple is constant and is stored in co_consts. This will, however, produce a different text in the exception when x is not 0 or 1. The type of exception is going to be the same.
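The difference in error text can be observed directly. The short sketch below (index_error_for is a hypothetical helper written for this illustration) indexes a list and a tuple out of range and compares the exceptions.

```python
def index_error_for(container, index):
    # Return the IndexError raised by container[index], or None on success.
    try:
        container[index]
    except IndexError as exc:
        return exc
    return None


list_error = index_error_for([1, 2], 5)
tuple_error = index_error_for((1, 2), 5)

print(type(list_error) is type(tuple_error))   # same exception type
print(str(list_error))                         # message names the list
print(str(tuple_error))                        # message names the tuple
```

Code that only catches IndexError is unaffected by such a rewrite; code that inspects the message text would see the change.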
[issue11549] Build-out an AST optimizer, moving some functionality out of the peephole optimizer
Eugene Toder added the comment:

If I'm not missing something, changing

x in [1,2]

to

x in (1,2)

and

x in {1,2}

to

x in frozenset([1,2])

does not change any error messages. Agreed that without dynamic compilation we can pretty much only track literals (including functions and lambdas) assigned to local variables.
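A quick way to spot-check the membership rewrites is to compare outcomes for a handful of values, including an unhashable one. This sketch uses a hypothetical helper, membership, and compares only results and exception types; it does not check the exact message text.

```python
def membership(container, value):
    # Return ("ok", result) or ("error", exception type) for `value in container`.
    try:
        return ("ok", value in container)
    except Exception as exc:
        return ("error", type(exc))


values = [1, 3, "a", None, [1]]   # the last value is unhashable

list_vs_tuple = all(
    membership([1, 2], v) == membership((1, 2), v) for v in values
)
set_vs_frozenset = all(
    membership({1, 2}, v) == membership(frozenset([1, 2]), v) for v in values
)
print(list_vs_tuple, set_vs_frozenset)
```

An unhashable operand raises TypeError for both the set and the frozenset form, while the list and tuple forms simply report that it is not a member.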