On 4/19/22 23:08, Tom Lane wrote: > I wrote: >> it looks like the problem is that the extended stats haven't been used >> while forming the estimate of the number of index entries retrieved, so >> we overestimate the cost of using this index. >> That seems like a bug. Tomas? > > I dug into this enough to locate the source of the problem. > btcostestimate includes the partial index clauses in what it > sends to clauselist_selectivity, but per the comments for > add_predicate_to_index_quals: > > * Note that indexQuals contains RestrictInfo nodes while the indpred > * does not, so the output list will be mixed. This is OK for both > * predicate_implied_by() and clauselist_selectivity(), but might be > * problematic if the result were passed to other things. > > That comment was true when it was written, but it's been falsified > by the extended-stats patches, which have added a whole lot of logic > in and under clauselist_selectivity that ignores clauses that are not > RestrictInfos. > > While we could perhaps fix this by having add_predicate_to_index_quals > add RestrictInfos, I'm inclined to feel that the extended-stats code > is in the wrong. The contract for clauselist_selectivity has always > been that it could optimize if given RestrictInfos rather than bare > clauses, not that it would fail to work entirely without them. > There are probably more places besides add_predicate_to_index_quals > that are relying on that. >
Yes, that seems like a fair assessment. I'll look into fixing this, not sure how invasive it will get, though. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company