On Mon, 27 Oct 2008 17:03:58 -0700, Glenn Linderman wrote:

>>> A little harder question is how to create a key that corresponds to
>>> ascending string followed by descending string?
>>>
>>> To do that you can sort the data two times, relying on the stable
>>> nature of the Python sort.
>>>
> Ick. Costs twice as much.
But that's only a constant cost. For small amounts of data, it's
trivial, and for large amounts, it's dominated by the cost of sorting
N items, which is O(N log N). (Python's sort is timsort, a modified
merge sort.)

[...]

> This solution seems to be the solution I was looking for. I think "Neg"
> is a poor name, but all the good names are significantly longer... Given
> its usage, though, in a sorted call, "Rev" would be a better poor name!
> ReversedCompare might be a better long name.
>
> How can I suggest adding it to the documentation for Python 3.0? It is
> much preferable to the "obvious" solution of sorting twice, and doing
> that is a trap which many people might fall into. I posted here
> instead, and thank you for the solution.

Why do you call it a trap? Your preferred solution is significantly
slower, even for small amounts of data, than the solution you call a
trap:

class Neg(object):
    """Key wrapper that reverses the comparison of the wrapped value."""
    def __init__(self, value):
        self.value = value
    def __cmp__(self, other):
        return -cmp(self.value, other.value)

def sort_updown1(pairs):
    # One sort: ascending on the first item, descending on the second.
    return sorted(pairs, key=lambda (s, t): (s, Neg(t)))

import operator

def sort_updown2(pairs):
    # Two stable sorts: descending on the secondary key first,
    # then ascending on the primary key.
    pairs = pairs[:]
    pairs.sort(key=operator.itemgetter(1), reverse=True)
    pairs.sort(key=operator.itemgetter(0))
    return pairs

pairs = [('al', 'boo'), ('zul', 'ao'), ('al', 'den'), ('zul', 'o')]

from timeit import Timer
t1 = Timer("sort_updown1(pairs)",
           "from __main__ import sort_updown1, pairs")
t2 = Timer("sort_updown2(pairs)",
           "from __main__ import sort_updown2, pairs")

And the results:

>>> t1.repeat(number=1000)
[0.053807973861694336, 0.053076028823852539, 0.052966117858886719]
>>> t2.repeat(number=1000)
[0.022480964660644531, 0.022369861602783203, 0.02264404296875]

The difference is even more significant for larger amounts of data:

>>> pairs = pairs*100
>>> t1.repeat(number=1000)
[5.3583409786224365, 4.6985390186309814, 4.6953370571136475]
>>> t2.repeat(number=1000)
[1.1064291000366211, 1.0017910003662109, 0.48421788215637207]

And more significant still as we sort even more data:

>>> pairs = pairs*10
>>> t1.repeat(number=1000)
[53.255411863327026, 53.792617082595825, 53.549386024475098]
>>> t2.repeat(number=1000)
[5.8788919448852539, 5.0701780319213867, 4.8528299331665039]

Or, tabulating the results:

  N  =     4    400   4000
  -------------------------
  t1 =  0.05    4.6   53.3
  t2 =  0.02    0.5    4.9

As you can see, the supposed "trap" of sorting twice is significantly
faster, by an order of magnitude, than sorting once with the Neg
wrapper key. It would be even faster if I didn't bother making a copy
of the list before sorting.

-- 
Steven
--
http://mail.python.org/mailman/listinfo/python-list
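
For Python 3, where cmp(), __cmp__ and tuple-unpacking lambda
parameters no longer exist, the snippets above won't run as written.
A minimal sketch of the same two approaches might look like the
following (the class name Rev is just the illustrative rename mooted
in the quoted text; this hasn't been timed against the figures above):

from functools import total_ordering
import operator

@total_ordering
class Rev(object):
    """Key wrapper that reverses the sort order of the wrapped value."""
    def __init__(self, value):
        self.value = value
    def __eq__(self, other):
        return self.value == other.value
    def __lt__(self, other):
        # Reversed: self sorts first when its value is the larger one.
        return other.value < self.value

def sort_updown1(pairs):
    # One sort: ascending on the first item, descending on the second.
    return sorted(pairs, key=lambda st: (st[0], Rev(st[1])))

def sort_updown2(pairs):
    # Two stable sorts: descending on the secondary key first,
    # then ascending on the primary key.
    pairs = sorted(pairs, key=operator.itemgetter(1), reverse=True)
    pairs.sort(key=operator.itemgetter(0))
    return pairs

The two-pass version should keep the speed advantage measured above,
since it compares plain strings directly rather than calling a
Python-level __lt__ for every comparison.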