New submission from anthony shaw <anthony.p.s...@gmail.com>:

List comprehensions currently create a series of opcodes inside a code object, 
the first of which is BUILD_LIST with an oparg of 0, effectively creating a 
zero-length list with a preallocated size of 0.

If you're doing a simple list comprehension on an iterator, e.g.

def foo():
  a = iterable
  return [x for x in a]

Disassembly of <code object <listcomp> at 0x109db2c40, file "<stdin>", line 3>:
  3           0 BUILD_LIST               0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                 8 (to 14)
              6 STORE_FAST               1 (x)
              8 LOAD_FAST                1 (x)
             10 LIST_APPEND              2
             12 JUMP_ABSOLUTE            4
        >>   14 RETURN_VALUE

The list comprehension will do a list_resize on the 4, 8, 16, 25, 35, 46, 58, 
72, 88th iterations, etc.

This PR preallocates the list created in a list comprehension to the length of 
the iterator using PyObject_LengthHint().

It uses a new BUILD_LIST_PREALLOC opcode which builds a list with the allocated 
size of PyObject_LengthHint(co_varnames[oparg]).

[x for x in iterable] compiles to:

Disassembly of <code object <listcomp> at 0x109db2c40, file "<stdin>", line 3>:
  3           0 BUILD_LIST_PREALLOC      0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                 8 (to 14)
              6 STORE_FAST               1 (x)
              8 LOAD_FAST                1 (x)
             10 LIST_APPEND              2
             12 JUMP_ABSOLUTE            4
        >>   14 RETURN_VALUE

If the comprehension has ifs, then it will use the existing BUILD_LIST opcode

Testing using a range length of 10000

./python.exe -m timeit "x=list(range(10000)); [y for y in x]"

Gives 392us on the current 3.8 branch
and 372us with this change (about 8-10% faster)

the longer the iterable, the bigger the impact.

This change also catches the issue that a very large iterator, like a range 
object :
[a for a in range(2**256)]

Would cause the 3.8< interpreter to consume all memory and crash because there 
is no check against PY_SSIZE_MAX currently.

With this change (assuming there is no if inside the comprehension) is now 
caught and thrown as an OverflowError:

>>> [a for a in range(2**256)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <listcomp>
OverflowError: Python int too large to convert to C ssize_t

----------
components: Interpreter Core
messages: 339586
nosy: anthony shaw, ncoghlan
priority: normal
severity: normal
status: open
title: Optimize list comprehensions with preallocate size and protect against 
overflow
versions: Python 3.8

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue36551>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to