I want to revive a patch I sent couple years ago to the performance list, as I have seen the issue pop up repeatedly since then.
Original thread is here: https://www.postgresql.org/message-id/CAK_s-G2tc8r0N3AqPr8fW5QsRQMbZNurgAQ%3D_ME1aaf4vOmnnA%40mail.gmail.com The problem is that if you have a highly skewed distribution, such as exponential frequencies, it usually stores too few entries in the MCV list. For example: create table foo as select floor(-log(random())/log(2))::int as who from generate_series(1,10000000); This will usually store 0 to 5 as MCV with the default stat target[1]. Then value 6 is not stored, and its ~75k repetitions get averaged over the remaining distinct values. This leads to horrible misplanning for rare values. For example, who=17 has 37 entries, but is estimated to have 20,000. The correct join for 37 values is often not the correct one for 20,000. If we stored just a few more values, their inclusion in the MCV would mean they are depleted from the residual count, correctly lowering the estimate we would get for very rare values not included in the sample. So instead of having the threshold of 1.25x the average frequency over all values, I think we should use 1.25x the average frequency of only those values not already included in the MCV, as in the attached. As it is, you can partially overcome the too short MCV list by cranking up the default statistics target, but this is a weak effect and comes at a high cost of CPU time. In some of the real distributions I've looked at, cranking up default statistics target is almost entirely ineffective. I think that perhaps maxmincount should also use the dynamic values_cnt_remaining rather than the static one. After all, things included in the MCV don' get represented in the histogram. When I've seen planning problems due to skewed distributions I also usually see redundant values in the histogram boundary list which I think should be in the MCV list instead. But I have not changed that here, pending discussion. [1] Occasionally it will store a much longer MCV list, because no values was sampled exactly once, which triggers a different code path in which all seen values are put in the MCV and the histogram is NULL. This is not reliable, as whether the least-sample value is present in the sample once or twice is pretty brittle. Cheers, Jeff
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c index f952b3c..b02ac20 100644 --- a/src/backend/commands/analyze.c +++ b/src/backend/commands/analyze.c @@ -2575,7 +2575,8 @@ compute_scalar_stats(VacAttrStatsP stats, * the MCV list is not complete, it's generally worth being more * selective, and not just filling it all the way up to the stats * target. So for an incomplete list, we try to take only MCVs that - * are significantly more common than average. + * are significantly more common than average frequency for values which + * will not be represented in the MCV list. */ if (track_cnt == ndistinct && toowide_cnt == 0 && stats->stadistinct > 0 && @@ -2587,6 +2588,7 @@ compute_scalar_stats(VacAttrStatsP stats, else { double ndistinct_table = stats->stadistinct; + double values_cnt_remaining = (double) values_cnt; double avgcount, mincount, maxmincount; @@ -2594,25 +2596,27 @@ compute_scalar_stats(VacAttrStatsP stats, /* Re-extract estimate of # distinct nonnull values in table */ if (ndistinct_table < 0) ndistinct_table = -ndistinct_table * totalrows; - /* estimate # occurrences in sample of a typical nonnull value */ - avgcount = (double) nonnull_cnt / ndistinct_table; - /* set minimum threshold count to store a value */ - mincount = avgcount * 1.25; - if (mincount < 2) - mincount = 2; - /* don't let threshold exceed 1/K, however */ - maxmincount = (double) values_cnt / (double) num_bins; - if (mincount > maxmincount) - mincount = maxmincount; - if (num_mcv > track_cnt) - num_mcv = track_cnt; - for (i = 0; i < num_mcv; i++) - { + /* estimate # of occurrences in sample of a typical nonnull value */ + for (i = 0; i < num_mcv; i++) + { + avgcount = values_cnt_remaining / ndistinct_table; + /* set minimum threshold count to store a value */ + mincount = avgcount * 1.25; + if (mincount < 2) + mincount = 2; + /* don't let threshold exceed 1/K, however */ + maxmincount = (double) values_cnt / (double) (num_bins); + if (mincount > maxmincount) + mincount = maxmincount; + if (num_mcv > track_cnt) + num_mcv = track_cnt; if (track[i].count < mincount) { num_mcv = i; break; } + values_cnt_remaining -= track[i].count; + ndistinct_table--; } }