As promised, I've done another round of tests (script and spreadsheet attached) with

- v15 with 6974924347 and cc58eecc5d reverted
- v15 with Thomas' patch
- v15 with David's patch
- v15 as is ("std")

...where v15 is at 7b735f8b52ad. This time I limited it to int, bigint, and text types.

Since more cases now use random distributions, I also took some measures to tighten up the measurements:

- Reuse the same random distribution for all tests where the input is randomized, by invoking the script with/without a second parameter
- For the text case, use lpadded ints so that lexicographic order is the same as numeric order.

I verified David's mod100 test case and added most test cases from the Orson Peters paper I mentioned above. I won't explain all of them here, but the low cardinality ones are randomized sets of:

- mod8
- dupsq: x mod sqrt(n), for 10 million about 3 thousand distinct values
- dup8: (x**8 + n/2) mod n, for 10 million about 80 thousand distinct values, about 80% with 64 duplicates and 20% with 256 duplicates

All the clear regressions I can see in v15 are in the above for one or more query types / data types, and both Thomas and David's patches restore performance for those. More broadly than the regression, Thomas' is very often the fastest of all, at the cost of more binary size. David's is occasionally slower than v15 or v15 with revert, but much of that is a slight difference and some is probably noise.

--
John Naylor
EDB: http://www.enterprisedb.com
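
For readers without the attachments, here is a rough Python sketch of the low-cardinality distributions described above. It is not the attached script: the function names, the fixed seed, and the choice of '0' as the pad character are assumptions made for illustration only.

```python
import math
import random


def mod8(n):
    """x mod 8: only 8 distinct values."""
    return [x % 8 for x in range(n)]


def dupsq(n):
    """x mod sqrt(n): roughly sqrt(n) distinct values."""
    r = math.isqrt(n)  # integer square root (assumed rounding)
    return [x % r for x in range(n)]


def dup8(n):
    """(x**8 + n/2) mod n, using modular exponentiation to keep numbers small."""
    return [(pow(x, 8, n) + n // 2) % n for x in range(n)]


def shuffled(values, seed=42):
    """Randomize order with a fixed seed so every run sees the same input.

    The seed value is hypothetical; the point is only that it is reused.
    """
    out = list(values)
    random.Random(seed).shuffle(out)
    return out


def as_padded_text(values, pad="0"):
    """Left-pad to a common width so text order matches numeric order.

    The pad character is an assumption; the email only says "lpadded".
    """
    width = len(str(max(values)))
    return [str(v).rjust(width, pad) for v in values]


if __name__ == "__main__":
    n = 10_000
    for make in (mod8, dupsq, dup8):
        data = shuffled(make(n))
        print(make.__name__, "distinct values:", len(set(data)))
```

Reusing one seed mirrors the idea of testing every build against the same randomized input, so differences between runs reflect the sort code rather than the data.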
qsort-fix-regression-jcn1.ods
Description: application/vnd.oasis.opendocument.spreadsheet
sort-bench-int-text-plus-low-card.sh
Description: application/shellscript