Re: Index AM API cleanup

Peter Eisentraut Mon, 26 Aug 2024 05:21:56 -0700

On 21.08.24 21:25, Mark Dilger wrote:

The next twenty patches are a mix of fixes of various layering
violations, such as not allowing non-core index AMs from use in replica
identity full, or for speculative insertion, or for foreign key
constraints, or as part of merge join; with updates to the "treeb" code
as needed.  The changes to "treeb" are broken out so that they can also
easily be excluded from whatever gets committed.

I made a first pass through this patch set. I think the issues it aims to address are mostly legitimate. In a few cases, we might need some more discussion and perhaps will end up slicing the APIs a bit differently. The various patches that generalize the strategy numbers appear to overlap with things being discussed at [0], so we should see that the solution covers all the use cases.

[0]:https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mdhcy4_qq0+noc...@mail.gmail.com

To make a dent, I picked out something that should be mostly harmless: Stop calling directly into _bt_getrootheight() (patch 0004). I think this patch is ok, but I might call the API function amgettreeheight instead of amgetrootheight. The former seems more general.
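As an illustration of the shape such a callback could take, here is a minimal, self-contained sketch. It does not use the real PostgreSQL headers: every Mock* type and mock_* function is a hypothetical stand-in, and only the names amgettreeheight and tree_height come from the discussion above. It assumes the callback is optional and that -1 marks an unknown height.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an open index relation. */
typedef struct MockIndexRel
{
    int root_level;             /* level of the root page, leaf level is 0 */
} MockIndexRel;

/* Hypothetical, cut-down stand-in for the index AM routine struct. */
typedef struct MockIndexAmRoutine
{
    /* Optional callback; NULL if the AM has no notion of tree height. */
    int (*amgettreeheight) (const MockIndexRel *rel);
} MockIndexAmRoutine;

/* Hypothetical, cut-down stand-in for the planner's per-index info. */
typedef struct MockIndexOptInfo
{
    int tree_height;            /* -1 if unknown (assumed convention) */
} MockIndexOptInfo;

/* What a btree-like AM might register as its callback. */
static int
mock_btgettreeheight(const MockIndexRel *rel)
{
    assert(rel != NULL);
    return rel->root_level;
}

/*
 * Planner side: go through the AM callback instead of calling into a
 * specific AM's internals.  AMs without the callback report "unknown".
 */
static void
mock_fill_tree_height(MockIndexOptInfo *info,
                      const MockIndexAmRoutine *am,
                      const MockIndexRel *rel)
{
    if (am->amgettreeheight != NULL)
        info->tree_height = am->amgettreeheight(rel);
    else
        info->tree_height = -1;
}
```

The point of the indirection is that the caller no longer needs to know which AM it is dealing with; an AM that leaves the pointer NULL simply gets the "unknown" value.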

Also, a note for us all in this thread: changes to the index AM API need updates to the corresponding documentation in doc/src/sgml/indexam.sgml.

I notice that _bt_getrootheight() is called only to fill in the IndexOptInfo tree_height field, which is only used by btcostestimate(), so in some sense this is btree-internal data. But making it so that btcostestimate() calls _bt_getrootheight() directly to avoid all that intermediate business seems too complicated, and there was probably a reason that the cost estimation functions don't open the index.

Interestingly, the cost estimation functions for gist and spgist also look at the tree_height field, but nothing ever fills it in. So with your API restructuring, someone could provide the missing API functions for those index types. Might be interesting.

That said, there might be value in generalizing this a bit. If you look at the cost estimation functions in pgvector (hnswcostestimate() and ivfflatcostestimate()), they both have this pattern that btcostestimate() tries to avoid: They open the index, look up some number, close the index, then make a cost estimate computation with the number looked up. So another idea would be to generalize the tree_height field to some "index size data" or even "internal data for cost estimation". This wouldn't need to change the API much, since these are all just integer values, but we'd label the functions and fields a bit differently.
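To make the generalization idea concrete, here is a rough, self-contained sketch. All names (Sketch* types, sketch_* functions, SKETCH_MAX_COST_DATA) are hypothetical and exist only for illustration; nothing here is taken from PostgreSQL or pgvector, and the cost formula is a placeholder. It assumes the AM fills a small array of integers while the index is already open, so that the cost estimation function later reads the cached values without opening the index itself.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_COST_DATA 4  /* hypothetical upper bound */

/* Hypothetical stand-in for an open index. */
typedef struct SketchOpenIndex
{
    int stored_value;           /* some number the AM keeps in its metadata */
} SketchOpenIndex;

/* Generalized replacement for a single tree_height integer. */
typedef struct SketchCostData
{
    int nvalues;                        /* 0 means "AM provided nothing" */
    int values[SKETCH_MAX_COST_DATA];   /* meaning is private to the AM */
} SketchCostData;

/* Hypothetical optional AM callback type. */
typedef void (*sketch_amgetcostdata_fn) (const SketchOpenIndex *index,
                                         SketchCostData *out);

/* Example AM callback: looks up one number while the index is open. */
static void
sketch_example_getcostdata(const SketchOpenIndex *index, SketchCostData *out)
{
    assert(index != NULL && out != NULL);
    out->nvalues = 1;
    out->values[0] = index->stored_value;
}

/* Planner side: called at the point where the index is open anyway. */
static void
sketch_collect_cost_data(sketch_amgetcostdata_fn fn,
                         const SketchOpenIndex *index,
                         SketchCostData *out)
{
    out->nvalues = 0;
    if (fn != NULL)
        fn(index, out);
}

/*
 * Cost estimator: uses only the cached data and never opens the index.
 * The formula is a placeholder, not a real cost model.
 */
static double
sketch_cost_estimate(const SketchCostData *data, double per_unit_cost)
{
    if (data->nvalues < 1)
        return 0.0;             /* nothing known, no extra cost charged */
    return data->values[0] * per_unit_cost;
}
```

Under these assumptions the existing tree_height behavior would be the special case of a single value, and only the labels of the callback and field change.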
