I wrote: > That said, I think what I'll do next is test the v3-0001 standalone > patch with tuplesort specializations for more data types. I already > have a decent test script that I can build on for this.
I've run a test with 10 million records using all types found in the v3 patch "accelerate tuple sorting for common types", using a variety of initial orderings, covering index build (btree only, no gist) and queries (both single value and whole record). Attached is the test script and a spreadsheet with the raw data as well as comparisons of the min runtimes in seconds from 5 runs. This is using gcc 11.1 on fairly recent Intel hardware. Overall, this shows a good improvement for these types. One exception is the "0/1" ordering, which is two values in random order. I'm guessing it's because of the cardinality detector, but some runs have apparent possible regressions. It's a bit high and sporadic to just blow off as noise, but this case might not be telling us anything useful. Other notes: - The inet type seems unnaturally fast in some places, meaning faster than int or date. That's suspicous, but I haven't yet dug deeper into that. - With the patch, the VM binary size increases by ~9kB. I have some hunches on the "future research" comments: XXX Can we avoid repeating the null-handling logic? More templating? ;-) XXX Is it worth specializing for reverse sort? I'll run a limited test on DESC to see if anything stands out, but I wonder if the use case is not common -- I seem to remember seeing DESC less often on the first sort key column. XXX Is it worth specializing for nulls first, nulls last, not null? Editorializing the null position in queries is not very common in my experience. Not null is interesting since it'd be trivial to pass constant false to the same Apply[XYZ]SortComparator() and let the compiler remove all those branches for us. On the other hand, those branches would be otherwise predicted well, so it might make little or no difference. XXX Should we have separate cases for "result is authoritative", "need XXX tiebreaker for atts 1..n (= abbrev case)", "need tie breaker for XXX atts 2..n"? The first one seems to be the only case where the SortTuple could be smaller, since the tuple pointer is null. That sounds like a good avenue to explore. Less memory usage is always good. Not sure what you mean by the third case -- there are 2+ sort keys, but the first is authoritative from the datum, so the full comparison can skip the first key? -- John Naylor EDB: http://www.enterprisedb.com
qsort-specialize-types-jcn1.ods
Description: application/vnd.oasis.opendocument.spreadsheet
test-tuplesort-20220113.ods
Description: application/vnd.oasis.opendocument.spreadsheet