[Joachim Durchholz]
>>> Wikipedia says it's going from 2NlogN to N. If a sort is massively
>>> dominated by the comparison, that could give a speedup of up to 100%
>>> (approximately - dropping the logN factor is almost irrelevant, what
>>> counts is losing that factor of 2).
[Gabriel Genellina]
>> In fact it's the other way - losing a factor of 2 is irrelevant,
>> O(2N)=O(N). The logN factor is crucial here.

[Joachim Durchholz]
> That's just a question of what you're interested in.
>
> If it's asymptotic behavior, then the O(logN) factor is a difference.
>
> If it's practical speed, a constant factor of 2 is far more relevant
> than any O(logN) factor.

Nope. Even if you're thinking of base 10 logarithms, log(N, 10) > 2 for
every N > 100. Base 2 logarithms are actually most appropriate here, and
log(N, 2) > 2 for every N > 4. So even if the "2" made sense here (it
doesn't -- see next paragraph), the log(N) term dominates it for even
relatively tiny values of N.

Now it so happens that current versions of Python (and Perl) use merge
sorts, where the worst-case number of comparisons is roughly N*log(N, 2),
not Wikipedia's 2*N*log(N, 2) (BTW, the Wikipedia article neglected to
specify the base of its logarithms, but it's clearly intended to be 2).
So that factor of 2 doesn't even exist in current reality: the factor of
log(N, 2) is the /whole/ worst-case story here, and, e.g., is near 10
when N is near 1000. A factor of 10 is nothing to sneeze at.

OTOH, current versions of Python (and Perl) also happen to use "adaptive"
merge sorts, which can do as few as N-1 comparisons in the best case (the
input is already sorted, or reverse-sorted). Now I'm not sure why anyone
would sort a list already known to be sorted, but if you do, DSU is a
waste of time. In any other case, it probably wins, and usually wins big,
and solely by saving a factor of (up to) log(N, 2) key computations.

> (I'm not on the list, so I won't see responses unless specifically
> CC'd.)

Done.
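A minimal sketch (not from the original thread) of the trade-off described
above: a comparison-based sort recomputes the key inside every comparison,
while DSU computes each key exactly once. The key function, list contents,
and counter below are illustrative assumptions, not anything from the post.

    import random

    key_calls = 0

    def expensive_key(s):
        # Stand-in for a costly key computation; counts how often it runs.
        global key_calls
        key_calls += 1
        return s.lower()

    class CompareByKey:
        # Wrapper that recomputes the key on every comparison.
        def __init__(self, value):
            self.value = value
        def __lt__(self, other):
            return expensive_key(self.value) < expensive_key(other.value)

    data = ["Item%d" % random.randrange(10 ** 6) for _ in range(1000)]

    # Comparison-based sort: two key calls per comparison, and roughly
    # N*log(N, 2) comparisons in the worst case.
    key_calls = 0
    by_cmp = [w.value for w in sorted(CompareByKey(s) for s in data)]
    print("key calls, comparison-based:", key_calls)

    # DSU (decorate-sort-undecorate): exactly N key calls.
    key_calls = 0
    decorated = [(expensive_key(s), i, s) for i, s in enumerate(data)]
    decorated.sort()
    by_dsu = [s for (_, _, s) in decorated]
    print("key calls, DSU:", key_calls)

    assert by_cmp == by_dsu

For what it's worth, the key= argument to sorted() and list.sort() does the
same decorate-once work internally, so the explicit tuple decoration is only
needed when you want to spell it out yourself.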