Hi, over in the pgsql-general channel, Michael Lewis reported [1] a bit strange behavior switching between good/bad estimates with extended statistics.
The crux of the issue is that with statistics containing both MCV and functional dependencies, we prefer applying the MCV. And functional dependencies are used only for the remaining clauses on columns not covered by the MCV list. This works perfectly fine when the clauses match a MCV item (or even multiple of them). But if there's no matching MCV item, this may be problematic - statext_mcv_clauselist_selectivity tries to be smart, but when the MCV represents only a small fraction of the data set the results may not be far from just a product of selectivities (as if the clauses were independent). So I'm wondering about two things: 1) Does it actually make sense to define extended statistics with both MCV and functional dependencies? ISTM the MCV part will always filter all the clauses, before we even try to apply the dependencies. 2) Could we consider the functional dependencies when estimating the part not covered by the MCV list. Of course, this could only help with equality clauses (as supported by functional dependencies). regards [1] https://www.postgresql.org/message-id/CAMcsB%3Dy%3D3G_%2Bs_zFYPu2-O6OMWOvOkb3t1MU%3D907yk5RC_RaYw%40mail.gmail.com -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company