Bugs item #1085744 was opened at 2004-12-15 07:15
Message generated for change (Comment added) made by rhettinger
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1085744&group_id=5470
Category: Python Interpreter Core
Group: Python 2.5
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Nick Coghlan (ncoghlan)
Assigned to: Raymond Hettinger (rhettinger)
Summary: PySequence_Tuple not as fast as PySequence_List

Initial Comment:
A little "my code is faster than your code" game on python-list turned up
a pathological interaction between itertools.chain and str.join:

C:\>python -m timeit -s "data = [map(str, range(x)) for x in range(1000)]; from itertools import chain" "''.join(chain(*data))"
10 loops, best of 3: 1.2 sec per loop    ****** OUCH!!

Some extra experiments show that creating a list from the result of chain
is fast, but creating a tuple is horrendously slow:

C:\>python -m timeit -s "data = [map(str, range(x)) for x in range(1000)]; from itertools import chain" "''.join(list(chain(*data)))"
10 loops, best of 3: 107 msec per loop

C:\>python -m timeit -s "data = [map(str, range(x)) for x in range(1000)]; from itertools import chain" "''.join(tuple(chain(*data)))"
10 loops, best of 3: 1.2 sec per loop

Creating the tuple by way of a list is actually faster than creating the
tuple directly:

C:\>python -m timeit -s "data = [map(str, range(x)) for x in range(1000)]; from itertools import chain" "''.join(tuple(list(chain(*data))))"
10 loops, best of 3: 146 msec per loop

A check with imap suggests the problem may be a more general interaction
between PySequence_Fast and iterators that don't have __len__ methods:

C:\>python -m timeit -s "data = [y for x in range(1000) for y in map(str, range(x))]; from itertools import imap; val = lambda arg: arg" "''.join(list(imap(val, data)))"
10 loops, best of 3: 310 msec per loop

C:\>python -m timeit -s "data = [y for x in range(1000) for y in map(str, range(x))]; from itertools import imap; val = lambda arg: arg" "''.join(imap(val, data))"
10 loops, best of 3: 1.4 sec per loop

Looking at the code supports that, too: PySequence_Fast uses
PySequence_Tuple, which is great when PyObject_Size gives a nice answer,
but can lead to a lot of tuple resizing when it doesn't (one resize per 10
items beyond 10 up to 500, then one resize per 100 items beyond 500).
2.4's optimised list extension means that this *really* hurts
performance-wise.

The other aspect is whether some of the utilities in itertools could
benefit from a __len__ method that returned a sensible answer if their
inputs defined __len__ methods, and returned -1 otherwise (this would
certainly work well with PySequence_Tuple, but I realise it's slightly odd
behaviour for a __len__ method).

----------------------------------------------------------------------

>Comment By: Raymond Hettinger (rhettinger)
Date: 2004-12-16 05:39

Message:
Logged In: YES
user_id=80475

Loaded the patch.  See Objects/abstract.c 2.133

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2004-12-15 16:27

Message:
Logged In: YES
user_id=80475

Attaching a small patch for PySequence_Tuple. Try it out and let me know
whether you find it to be an improvement. It also adds a bit of error
checking that should have been there anyway.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2004-12-15 12:14

Message:
Logged In: YES
user_id=80475

P.S. In your example, you can get an immediate performance improvement by
having PySequence_Fast use PySequence_List instead of PySequence_Tuple.

If you want to work on some Py2.5 optimizations along these lines, look at
replacing the somewhat arbitrary over-allocation strategy in
PySequence_Tuple. If you're super-motivated, see if you can improve on the
algorithm for str.join. Since these are just strategy changes, they will
improve some cases at the expense of others. The goal is to find the one
that does best most of the time, is not horrible in the worst cases, and
makes the fewest demands on memory.
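To get a rough feel for the trade-off, here is a quick pure-Python
simulation of the two growth strategies (a sketch only, not the
interpreter's C code: the tuple constants follow the resizing pattern
described in the report, and the list formula approximates the one in
Objects/listobject.c, so the exact counts inside CPython may differ):

    # Sketch of the two over-allocation strategies described above;
    # the constants are approximations, not the interpreter's own code.

    def tuple_resizes(n):
        # Fixed-increment strategy: grow by 10 up to 500 items, then by 100.
        resizes, allocated = 0, 10
        while allocated < n:
            allocated += (allocated < 500) and 10 or 100
            resizes += 1
        return resizes

    def list_resizes(n):
        # Proportional strategy: over-allocate by roughly 1/8 each time.
        resizes, allocated = 0, 0
        for size in xrange(1, n + 1):
            if size > allocated:
                allocated = size + (size >> 3) + (size < 9 and 3 or 6)
                resizes += 1
        return resizes

    for n in (1000, 100000, 500000):
        print n, tuple_resizes(n), list_resizes(n)

For half a million items the fixed-increment strategy implies several
thousand reallocations (each one copying an ever-larger block), versus
well under a hundred for the proportional one -- which is why
tuple(chain(*data)) loses so badly to list(chain(*data)) above.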
----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2004-12-15 10:04

Message:
Logged In: YES
user_id=80475

Remain calm ;-)  Also, please isolate the issues instead of lumping
everything together. And this is likely best handled by email rather than
SF (you're asking me to explain all of your timings rather than pointing
to a buggy piece of code).

The idea (feature request) to make iterators length-transparent has
already been explored and reached a dead end. I've implemented it already
in the few situations where it can work (for example,
itertools.repeat(obj, n) and dict.iteritems() report their length). Most
other situations run into logical inconsistencies because mutable
iterables are indistinguishable from iterators over those iterables. See
Lib/test/test_iterlen.py for the gory details; the sketch at the end of
this comment makes the inconsistency concrete.

Writing f(*it) is also its own issue -- for uses other than argument
passing, Guido considers it an abuse (since * unwinds its components onto
ceval's stack).

Likewise, there is nothing unique to itertools here. All of the timings
show similar results if generators are used instead of itertools. This
issue is really about how consumers handle unsized iterable inputs. You
cover three different consumers -- ''.join(it), list(it), and tuple(it) --
which all take different approaches to unsized iterable inputs, so it is
no surprise that the three have different performance characteristics.

str.join() is its own little bundle of joy whose behavior is dictated by
its need to make multiple passes over the input. Improving its handling
of unsized iterable inputs is a thorny subject. You're welcome to post a
patch.

The best way to analyze what is going on is to disregard the timings and
instead draw little diagrams of what is in memory at any given time.
Also, draw a data migration path -- you want the algorithm to move the
elements as few times as possible. Be sure to handle cases like
dict.iteritems() which know their length but are not a list or tuple.
Also handle cases where the inputs are created on the fly (not already
held in memory as in your example): ''.join(str(i%10) for i in xrange(n)).

FYI, there was an interesting bug report on this subject a year ago.
IIRC, the OP saw that memory consumption was greater with an iterable
input than if the iterable had first been wrapped in list().

I don't see anything out of whack in the creation of lists and tuples
from unsized iterables. There are simply two different overallocation
algorithms; consequently, there will always be situations where one
outperforms the other. For your giant-sized examples, the tuple growth
strategy of fixed-size overallocations does poorly against the list
strategy of increasingly large overallocations. In contrast, tuples can
do better for certain small sizes. If you think there are critical use
cases for tuple(it) with a large, unsized it, then you can propose a
variable-length strategy like the one used for lists. If such use cases
exist, I suspect they don't fit well with Guido's intended philosophy on
tuple use.

Am closing this as a bug report; it is more a mixture of questions
(explain this timing) and feature requests. Feel free to send me emails
to explore this further. I've been through all of this before and it is a
substantial analysis project. If you go down this path, you may well find
some room for improvement (in str.join in particular). Also, be sure to
separate the issues. It is not at all helpful to simultaneously cover
itertools, *args, lists vs. tuples, unsized iterators, str.join,
list.extend, and a proposal to make iterators report their length.
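To make the length-transparency dead end concrete, here is a hypothetical
wrapper (sized_iter is illustrative only; nothing like it ships in
itertools or the patch) that forwards a length when its input is sized,
followed by the mutation problem that sinks the idea in general:

    class sized_iter(object):
        # Hypothetical helper for illustration; not part of itertools.
        """Iterate over `iterable`, reporting len() when the source is sized."""
        def __init__(self, iterable):
            self._it = iter(iterable)
            try:
                self._remaining = len(iterable)
            except TypeError:
                self._remaining = None          # unsized input
        def __iter__(self):
            return self
        def next(self):                         # Py2 iterator protocol
            item = self._it.next()
            if self._remaining is not None:
                self._remaining -= 1
            return item
        def __len__(self):
            if self._remaining is None:
                raise TypeError("length is unknown")
            return self._remaining

    data = range(5)
    it = sized_iter(data)
    data.append(99)     # len(it) still claims 5 items remain,
                        # but iterating will in fact yield six

The cached count and the underlying container can silently disagree, which
is why only fixed-size cases such as itertools.repeat(obj, n) can report a
length safely.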
----------------------------------------------------------------------

Comment By: Nick Coghlan (ncoghlan)
Date: 2004-12-15 07:17

Message:
Logged In: YES
user_id=1038590

Kicking this in your direction, Raymond, since it can badly affect the
performance of itertools.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1085744&group_id=5470