On Mon, 2 May 2022 at 09:20, Dan Stromberg <drsali...@gmail.com> wrote: > > > On Sun, May 1, 2022 at 1:44 PM Chris Angelico <ros...@gmail.com> wrote: >> >> On Mon, 2 May 2022 at 06:43, Dan Stromberg <drsali...@gmail.com> wrote: >> > On Sun, May 1, 2022 at 11:10 AM Chris Angelico <ros...@gmail.com> wrote: >> >> >> >> On Mon, 2 May 2022 at 01:53, Nas Bayedil <nbaye...@gmail.com> wrote: >> >> > We believe that using this method to develop completely new, fast >> >> > algorithms, approaching the speed of the famous *QuickSort*, the speed >> >> > of >> >> > which cannot be surpassed, but its drawback can be circumvented, in the >> >> > sense of stack overflow, on some data. >> >> >> >> Hmm, actually TimSort *does* exceed the speed of quicksort for a lot >> >> of real-world data. For instance, if you take a large sorted list, >> >> append a handful of (unsorted) items to it, and then sort the list, >> >> TimSort can take advantage of the fact that the bulk of the list is >> >> sorted. It ends up significantly faster than re-sorting the entire >> >> list. >> > >> > >> > In fact, Timsort is O(n) for already-sorted data, while many quicksorts >> > are O(n^2) for already-sorted data. >> > >> > Quicksort can be salvaged by using a median-of-3 partitioning, but it's >> > still O(n^2) in the (less common) worst case. >> > >> >> This is true, but purely sorted data isn't a very practical case. The >> case of mostly-sorted data IS practical, though, so it's a quite big >> win that it can be close to O(n), and still faster than inserting each >> item individually. > > > You seem to be of the impression that nearly-sorted data isn't an uphill > battle with a straightforward quicksort. > > I'm having a hard time convincing myself of that. >
The median-of-three partitioning technique makes that work reasonably well, so it won't be pathologically slow. It's hardly Quicksort's best feature, but it could easily be a lot worse. I'd have to check, but I think it still manages to be O(n log n). Merge sort, of course, is a lot more consistent, but the asymptotic cost is still broadly the same. But Timsort manages to be close to O(n) for sorted data, reversed data, nearly-sorted or nearly-reversed data, etc. Makes it very handy for jobs like "click a heading to sort by it", where you might add multiple sort keys. (Plus, Python's implementation has some cool tricks for small collections that make it quite efficient.) ChrisA -- https://mail.python.org/mailman/listinfo/python-list