Tim Peters <t...@python.org> added the comment:

A new version of the file models a version of the `powersort` merge ordering 
too.  It clearly dominates timsort and 2-merge in all cases tried, for this 
notion of "cost".

Against it, its code is much more complex, and the algorithm is very far from 
obvious.  The paper "motivates" it but doesn't really explain it in enough 
detail to make most of it clear (but refers to other papers where pieces of it 
are developed).

Curious:  unless a run is the last in an array, powersort never merges it 
immediately upon finding it.  Instead its start and length are used to compute 
a "power", in turn used to decide whether to merge some previous number of 
runs.  A dollar to anyone who can figure out what the heck the "power" 
computation is really doing in less than a full day without reading the paper 
;-)  Then two dollars if you can prove that the "never equal!" assert holds.

It would require timing lots of C code on various cases to see whether the 
possible delay in merging new runs really matters.  It's unclear to me a 
priori, because it buys something too (less memory copying overall).

In any case, just switching to 2-merge would be easy, and that alone is enough 
to squash the contrived "bad cases" for the current approach.  `powersort` 
would too, but may also actually help a bit in ordinary cases.  Or not.

----------
Added file: https://bugs.python.org/file47785/runstack.py

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue34561>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to