anthony shaw <anthony.p.s...@gmail.com> added the comment:

> I am -1 for this optimization because it affects only one particular case
> (neither other kinds of comprehensions, nor generator expressions, nor list
> comprehensions with conditions), and even in this case it is small. It is
> possible to add a lot of other optimizations for other cases which would
> speed them up by 50% or 100%, but we do not do this, because every such
> optimization has a cost. It increases the amount of code which should be
> maintained and covered by tests, it adds small overhead in common cases to
> speed up an uncommon case, and increasing the code base can negatively
> affect surrounding code (just because the CPU cache and registers are used
> inappropriately and the compiler optimizes less important paths).

Understood. I had hoped this change would have a broader impact, and the 
additional opcode is not ideal either.

> In addition, while this change speeds up list comprehensions for long lists, 
> it slows them down for short lists. Short lists are more common.

I've been profiling this today. Basically, this implementation receives the 
`list_iter`, `range_iter`, etc. objects.

There is no standard object model for an iterator's length: _PyObject_HasLen 
returns false because an iterator implements neither tp_as_sequence nor 
tp_as_mapping (rightly so).
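
For illustration only (the Python-level view of the same situation, assuming 
a recent CPython; operator.length_hint is the Python wrapper around 
PyObject_LengthHint):

    import operator

    it = iter([1, 2, 3])

    # list_iterator has no __len__ (no tp_as_sequence / tp_as_mapping),
    # so _PyObject_HasLen() is false and len(it) would raise TypeError.
    assert not hasattr(it, "__len__")

    # It does have __length_hint__, which is what the slow path in
    # PyObject_LengthHint ends up calling.
    assert it.__length_hint__() == 3
    assert operator.length_hint(it) == 3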

What this has uncovered (so hopefully there's some value from this whole 
experience!) is that __length_hint__ for iterators is _really_ inefficient. 
Take a list_iterator for example:

PyObject_LengthHint calls _PyObject_HasLen, which returns false, so it goes on 
to call
_PyObject_LookupSpecial to find __length_hint__, then
_PyObject_CallNoArg, which calls
listiter_len, which calls
PyList_GET_SIZE, which returns a Py_ssize_t, which is then converted to a 
PyLong via
PyLong_FromSsize_t and returned back to PyObject_LengthHint, where
PyLong_AsSsize_t is run to convert the PyLong back into a Py_ssize_t.

The Py_ssize_t is then finally returned to the caller!
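
In Python terms, the whole dance is roughly the sketch below; it follows the 
PEP 424 protocol rather than the C code line for line:

    import operator

    def length_hint(obj, default=0):
        # Rough Python equivalent of PyObject_LengthHint (simplified sketch).
        try:
            return len(obj)                   # fast path: _PyObject_HasLen /
        except TypeError:                     # PyObject_Length
            pass
        try:
            hint = type(obj).__length_hint__  # _PyObject_LookupSpecial
        except AttributeError:
            return default
        result = hint(obj)                    # _PyObject_CallNoArg -> listiter_len,
                                              # which boxes PyList_GET_SIZE's
                                              # Py_ssize_t into a PyLong via
                                              # PyLong_FromSsize_t
        if result is NotImplemented:
            return default
        return operator.index(result)         # PyLong_AsSsize_t: unbox it
                                              # straight back again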

My conclusion was that the list comprehension's result list should be presized 
to the length of the target before GET_ITER is run. This would remove the 
overhead for range objects, because you could simply call _PyObject_HasLen, 
which would return true for dict, list, tuple and set, but false for range 
objects (which is what you want).
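
As a rough Python-level sketch of the idea only (the function name and the 
fallback behaviour here are mine, not part of any patch):

    def presized_listcomp(iterable, func):
        # Hypothetical model of presizing the result from the *iterable*,
        # before the equivalent of GET_ITER; not the actual patch.
        try:
            n = len(iterable)            # the _PyObject_HasLen fast path
        except TypeError:
            n = 0                        # unknown length: behave as today
        result = [None] * n              # stands in for a presized PyList_New(n)
        i = 0
        for item in iterable:
            value = func(item)
            if i < n:
                result[i] = value
            else:
                result.append(value)     # longer than its reported len()
            i += 1
        del result[i:]                   # shorter than its reported len()
        return result

presized_listcomp(range(1000), str) should return the same list as 
[str(x) for x in range(1000)], just without the repeated resizes when the 
length is known up front.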

The issue is that GET_ITER is called outside the code object for the 
comprehension, so you'd have to pass an additional argument to the 
comprehension generator.
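
The split is easy to see in the bytecode (exact opcodes vary by CPython 
version):

    import dis

    # GET_ITER appears in the enclosing code, and the resulting iterator is
    # the single argument passed to the separate <listcomp> code object.
    dis.dis(compile("[x * 2 for x in data]", "<demo>", "eval"))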

This is way outside my expertise, but it's the only way I can see to get a 
sizeable benefit with minimal code and no edge cases that are slower.

Thanks for your time

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue36551>
_______________________________________