On Wed, 30 Mar 2011 11:06:20 +0200, Antoon Pardon wrote: > As far as I can see, key will only produce significant speedups, if > comparing items can then be completly done internally in the python > engine without referencing user python code.
Incorrect. You don't even need megabytes of data to see significant differences. How about a mere 1000 short strings? [steve@wow-wow ~]$ python2.6 Python 2.6.6 (r266:84292, Dec 21 2010, 18:12:50) [GCC 4.1.2 20070925 (Red Hat 4.1.2-27)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> from random import shuffle >>> data = ['a'*n for n in range(1000)] >>> shuffle(data) >>> from timeit import Timer >>> >>> t_key = Timer('sorted(data, key=lambda a: len(a))', ... 'from __main__ import data') >>> t_cmp = Timer('sorted(data, cmp=lambda a,b: cmp(len(a), len(b)))', ... 'from __main__ import data') >>> >>> min(t_key.repeat(number=1000, repeat=5)) 0.89357517051696777 >>> min(t_cmp.repeat(number=1000, repeat=5)) 7.6032949066162109 That's almost ten times slower. Of course, the right way to do that specific sort is: >>> t_len = Timer('sorted(data, key=len)', 'from __main__ import data') >>> min(t_len.repeat(number=1000, repeat=5)) 0.64559602737426758 which is even easier and faster. But even comparing a pure Python key function to the cmp function, it's obvious that cmp is nearly always slower. Frankly, trying to argue that cmp is faster, or nearly as fast, is a losing proposition. In my opinion, the only strategy that has even a faint glimmer of hope is to find a convincing use-case where speed does not matter. Or, an alternative approach would be for one of the cmp-supporters to take the code for Python's sort routine, and implement your own sort-with- cmp (in C, of course, a pure Python solution will likely be unusable) and offer it as a download. For anyone who knows how to do C extensions, this shouldn't be hard: just grab the code in Python 2.7 and make it a stand- alone function that can be imported. If you get lots of community interest in this, that is a good sign that the solution is useful and practical, and then you can push to have it included in the standard library or even as a built-in. And if not, well, at least you will be able to continue using cmp in your own code. -- Steven -- http://mail.python.org/mailman/listinfo/python-list