Per the discussion at [1], genericcostestimate() produces estimates
that are noticeably off for small indexes, because it fails to
discount the index metapage while computing numIndexPages.
Here's a first-draft attempt at improving that.

The basic issue is that the calculation of numIndexPages is (as the
comment says) meant to consider only leaf index pages, but we were
simply using the total index size (index->pages) in the formula.
Subtracting the metapage produces visibly saner results when the
index is only a couple pages in total.

I thought for a bit about trying to also discount index upper pages,
but decided it's not worth it, at least for now.  Given reasonable
index fanout, upper pages should amount to at most a percent or two
of the index, so accounting for them would only move the estimates by
a percent or two.  Moreover, it's hard to make a non-squishy estimate
of how many upper pages there are.  But we do know whether there's a
metapage or not, and failing to account for it produces 100% relative
error if the index has only one data-bearing page.  So that seems
worth dealing with.

Some notes:

* Adding a field to GenericCosts breaks ABI for external callers
of genericcostestimate(), but not API, if they followed the
recommendation to zero the whole struct.  If numNonLeafPages is
left at zero then the results don't change.  We wouldn't consider
back-patching a change like this anyway, so the ABI break is not
a problem.

* There are other uses of index->pages in selfuncs.c.  I looked
through them and didn't feel motivated to change any, but perhaps
someone else will have a different opinion.

* Unsurprisingly, this change causes several visible changes in the
core regression tests for index selection with small indexes.  In each
of them it seemed that the point of the test case was to test the plan
as-given.  So I hacked things up to keep the plans the same, either by
disabling an alternative plan choice or by increasing the size of the
table.

This is v19 material, so I'll park it in the next CF.

                        regards, tom lane

[1] 
https://www.postgresql.org/message-id/flat/CACJPJu8oY9hb7LSsqHxbn24Gpa_tWBkcwPei%3DfottvgBeSc%2BPQ%40mail.gmail.com

#text/x-diff; name="v1-discount-metapage-in-genericcostestimate.patch" 
[v1-discount-metapage-in-genericcostestimate.patch] 
/home/tgl/pgsql/v1-discount-metapage-in-genericcostestimate.patch


Reply via email to