On Sat, Feb 21, 2015 at 12:15 PM, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote: >> Although I cannot easily explain the disparity in performance between >> 1M and 5M sized sets for this query: >> >> select count(distinct randtxt) from stuff_text >> >> You did make sure that the queries didn't spill to disk, right? Or >> that they did so consistently, at least. > > All the queries were running with work_mem=1GB, and I don't think they > were spilling to disk. Actually, I don't have the 'merge' patch applied, > so that would probably crash because of SIGSEGV.
Actually, that isn't so -- Andrew's datum sort patch incidentally fixes tuplesort_begin_datum() in the same way as the patch I posted. You shouldn't actually need my patch, which was just an immediate fix. I can recreate the problem you see with text sort regressions. Abbreviation is aborted for the case in question, unsurprisingly, and fairly far in. With that many tuples, the idea of taking abbreviated cardinality as a proxy for full cardinality becomes less important, because either way you have to do at least 10 comparisons per item on average. Originally, I had as a heuristic that once you get to a million items without aborting, stop considering the possibility of aborting - that much should probably be added back, at a minimum. Andrew is already doing that in his numeric patch (at 100,000 tuples), and that may be the only reason why numeric is not regressed too. For a query like this, if I "#define DEBUG_ABBREV_KEYS", so that we don't actually abort, and run this query of yours until it stabilizes: select * from (select * from stuff_text order by randtxt offset 100000000000) foo; It takes about 5.3 seconds. If I don't, however - if it is allowed to run its course - it takes a whopping 35 seconds. So the ad-hoc cost model of the text opclass definitely needs more work, as I suspected. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers