On Fri, Aug 17, 2018 at 7:48 AM, Peter Geoghegan <p...@bowt.ie> wrote:
> On Wed, Aug 15, 2018 at 11:22 PM, Thomas Munro
> <thomas.mu...@enterprisedb.com> wrote:
>> * groups and certain aggregates (MIN() and MAX() of suffix index
>> columns within each group)
>> * index scans where the scan key doesn't include the leading columns
>> (but you expect there to be sufficiently few values)
>> * merge joins (possibly the trickiest and maybe out of range)
>
> FWIW, I suspect that we're going to have the biggest problems in the
> optimizer. It's not as if ndistinct is in any way reliable. That may
> matter more on average than it has with other path types.

Can you give an example of problematic ndistinct underestimation?

I suppose you might be able to defend against that in the executor: if
you find that you've done an unexpectedly high number of skips, you
could fall back to regular next-tuple mode. Unfortunately, that would
require the parent plan node to tolerate non-unique results.

I noticed that the current patch doesn't take restrictions on the
range into account (SELECT DISTINCT a FROM t WHERE a BETWEEN 500 AND
600), but that only causes it to overestimate the number of btree
searches, which is a less serious problem (it might not choose a skip
scan when one would have been better).

-- 
Thomas Munro
http://www.enterprisedb.com
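
To make the executor-side fallback idea concrete, here is a minimal sketch in Python. It is not code from the patch and not PostgreSQL executor code: a sorted list stands in for the btree, a binary search stands in for one btree search, and the names adaptive_skip_scan, unique and max_skips are hypothetical. It shows why the parent has to tolerate duplicates once the scan gives up on skipping.

```python
from bisect import bisect_right


def adaptive_skip_scan(keys, max_skips):
    """Yield leading-column values from the sorted list `keys`.

    Starts in skip mode (one search per distinct value). Once `max_skips`
    searches have been done, it switches to plain next-tuple mode, which
    may yield duplicates. `max_skips` is a hypothetical budget that would
    come from the planner's ndistinct estimate.
    """
    i = 0
    skips = 0
    while i < len(keys):
        yield keys[i]
        if skips < max_skips:
            # Skip mode: jump past every entry equal to keys[i].
            i = bisect_right(keys, keys[i], i)
            skips += 1
        else:
            # Fallback: step to the next entry; duplicates are possible.
            i += 1


def unique(sorted_values):
    """Hypothetical parent node that tolerates duplicates in sorted input."""
    last = object()  # sentinel that compares unequal to any real value
    for v in sorted_values:
        if v != last:
            yield v
            last = v


keys = [1, 1, 1, 2, 2, 3, 3, 3, 4, 4]
raw = list(adaptive_skip_scan(keys, max_skips=2))
deduped = list(unique(adaptive_skip_scan(keys, max_skips=2)))
```

With a budget of two skips, the scan emits 1 and 2 by skipping, then degrades to returning every remaining entry, so only a duplicate-tolerant parent such as unique() still produces the right DISTINCT result.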