Hi,

The attached patch version modifies how the non-MCV selectivity is computed, along the lines explained in the previous message.

The comments in statext_clauselist_selectivity() explain it in far more detail, but in short we do this:

1) Compute selectivity using the MCV (s1).

2) To compute the non-MCV selectivity (s2) we do this:

2a) See how many top-level equalities there are (and compute an ndistinct estimate for those attributes).

2b) If there is an equality on each column, we know there can only be a single matching item. If we found it in the MCV (i.e. s1 > 0) we're done, and 's1' is the answer.

2c) If only some columns have equalities, we estimate the selectivity for the equalities as

    s2 = ((1 - mcv_total_sel) / ndistinct)

If there are no remaining conditions, we're done.

2d) To estimate the non-equality clauses (on the non-MCV part only), we either repeat the whole process by calling clauselist_selectivity() or approximate by applying s1 to the non-MCV part. This needs a bit of care to prevent infinite loops.

Of course, with 0002 this changes slightly, because we may try using a histogram to estimate the non-MCV part. But that's just an extra step right before (2a).

regards

-- 
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
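For illustration only, steps (1) to (2d) above can be sketched roughly as follows. This is not the code from the patch: the function name, its parameters, the handling of the "no equalities" and "all equalities but not in the MCV" cases, and combining the two parts as s1 + s2 are all assumptions made for the sake of the example. The selectivity of the remaining non-equality clauses on the non-MCV part (other_sel) is simply passed in, since how it is obtained is exactly the part that needs care in (2d).

```python
def estimate_selectivity(s1, mcv_total_sel, ndistinct,
                         eq_columns, total_columns, other_sel=None):
    """Hypothetical sketch of steps (1)-(2d); not the patch's actual code.

    s1            -- selectivity of the clauses computed from the MCV list
    mcv_total_sel -- total frequency covered by all MCV items
    ndistinct     -- ndistinct estimate for the columns with equalities
    eq_columns    -- number of columns with a top-level equality clause
    total_columns -- number of columns covered by the statistics
    other_sel     -- assumed selectivity of the remaining (non-equality)
                     clauses on the non-MCV part, or None if there are none
    """
    # (2b) Equality on every column: at most one item can match.
    # If it was found in the MCV list, s1 is already the answer.
    if eq_columns == total_columns and s1 > 0:
        return s1

    if eq_columns > 0:
        # (2c) Spread the non-MCV frequency evenly over the distinct
        # combinations of the equality columns.
        # Assumption: the same formula is used when all columns have
        # equalities but the item was not found in the MCV list.
        s2 = (1.0 - mcv_total_sel) / ndistinct
    else:
        # Assumption: with no equalities, start from the whole non-MCV part.
        s2 = 1.0 - mcv_total_sel

    # (2d) Apply the remaining non-equality clauses to the non-MCV part.
    if other_sel is not None:
        s2 *= other_sel

    # Assumption: the MCV and non-MCV parts are disjoint, so they add up.
    return min(1.0, s1 + s2)
```

For example, with s1 = 0.1, mcv_total_sel = 0.6, ndistinct = 100 and an equality on one of two columns, the non-MCV part contributes (1 - 0.6) / 100 = 0.004 before any remaining clauses are applied.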
0001-multivariate-MCV-lists-20180401.patch.gz
Description: application/gzip
0002-multivariate-histograms-20180401.patch.gz
Description: application/gzip