Tim Peters <t...@python.org> added the comment:
A new version of the file models a version of the `powersort` merge ordering too. It clearly dominates timsort and 2-merge in all cases tried, for this notion of "cost". Against it, its code is much more complex, and the algorithm is very far from obvious. The paper "motivates" it but doesn't really explain it in enough detail to make most of it clear (but refers to other papers where pieces of it are developed). Curious: unless a run is the last in an array, powersort never merges it immediately upon finding it. Instead its start and length are used to compute a "power", in turn used to decide whether to merge some previous number of runs. A dollar to anyone who can figure out what the heck the "power" computation is really doing in less than a full day without reading the paper ;-) Then two dollars if you can prove that the "never equal!" assert holds. It would require timing lots of C code on various cases to see whether the possible delay in merging new runs really matters. It's unclear to me a priori, because it buys something too (less memory copying overall). In any case, just switching to 2-merge would be easy, and that alone is enough to squash the contrived "bad cases" for the current approach. `powersort` would too, but may also actually help a bit in ordinary cases. Or not. ---------- Added file: https://bugs.python.org/file47785/runstack.py _______________________________________ Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue34561> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com