On 26.08.24 17:10, Mark Dilger wrote:
To make a dent, I picked out something that should be mostly harmless: Stop
calling directly into _bt_getrootheight() (patch 0004). I think this patch is
ok, but I might call the API function amgettreeheight instead of
amgetrootheight. The former seems more general.
Peter, your proposed rename seems fine for the current implementation, but your
suggestion below might indicate a different naming.
I notice that _bt_getrootheight() is called only to fill in the IndexOptInfo
tree_height field, which is only used by btcostestimate(), so in some sense
this is btree-internal data. But making it so that btcostestimate() calls
_bt_getrootheight() directly to avoid all that intermediate business seems too
complicated, and there was probably a reason that the cost estimation functions
don't open the index.
Interestingly, the cost estimation functions for gist and spgist also look at
the tree_height field but nothing ever fills it on. So with your API
restructuring, someone could provide the missing API functions for those index
types. Might be interesting.
That said, there might be value in generalizing this a bit. If you look at the cost estimation
functions in pgvector (hnswcostestimate() and ivfflatcostestimate()), they both have this pattern
that btcostestimate() tries to avoid: They open the index, look up some number, close the index,
then make a cost estimate computation with the number looked up. So another idea would be to
generalize the tree_height field to some "index size data" or even "internal data
for cost estimation". This wouldn't need to change the API much, since these are all just
integer values, but we'd label the functions and fields a bit differently.
Would they be just integers? They could also be void*, with
amgetrootheight_function returning data allocated in the current memory
context. For btree, that would just be a four byte integer, but other indexes
could return whatever they like. If you like that idea, I can code that up for
v18, naming the field something like amgetcostestimateinfo_function.
Here is a cleaned-up version of the v17-0004 patch. I have applied the
renaming discussed above. I have also made a wrapper function
btgettreeheight() that calls _bt_getrootheight(). That way, if we ever
want to change the API, we don't have to change _bt_getrootheight(), or
disentangle it then. I've also added documentation and put in a source
code comment that the API could be expanded for additional uses. Also,
I have removed the addition to the IndexOptInfo struct; that didn't seem
necessary.From 739bb4f6e84b937a35b1c76d7c90df5c90ac3cbd Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <pe...@eisentraut.org>
Date: Tue, 3 Sep 2024 17:17:33 +0200
Subject: [PATCH v17.1] Add amgettreeheight index AM API routine
The only current implementation is for btree where it calls
_bt_getrootheight(). Other index types can now also use this to pass
information to their amcostestimate routine. Previously, btree was
hardcoded and other index types could not hook into the optimizer at
this point.
Author: Mark Dilger <mark.dil...@enterprisedb.com>
Discussion:
https://www.postgresql.org/message-id/flat/e72eaa49-354d-4c2e-8eb9-255197f55...@enterprisedb.com
---
contrib/bloom/blutils.c | 1 +
doc/src/sgml/indexam.sgml | 16 ++++++++++++++++
src/backend/access/brin/brin.c | 1 +
src/backend/access/gin/ginutil.c | 1 +
src/backend/access/gist/gist.c | 1 +
src/backend/access/hash/hash.c | 1 +
src/backend/access/nbtree/nbtree.c | 10 ++++++++++
src/backend/access/spgist/spgutils.c | 1 +
src/backend/optimizer/util/plancat.c | 11 +++++------
src/include/access/amapi.h | 8 ++++++++
src/include/access/nbtree.h | 1 +
src/test/modules/dummy_index_am/dummy_index_am.c | 1 +
12 files changed, 47 insertions(+), 6 deletions(-)
diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index 6836129c90d..a29330afcd3 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -137,6 +137,7 @@ blhandler(PG_FUNCTION_ARGS)
amroutine->amvacuumcleanup = blvacuumcleanup;
amroutine->amcanreturn = NULL;
amroutine->amcostestimate = blcostestimate;
+ amroutine->amgettreeheight = NULL;
amroutine->amoptions = bloptions;
amroutine->amproperty = NULL;
amroutine->ambuildphasename = NULL;
diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml
index e3c1539a1e3..dc7d14b60dd 100644
--- a/doc/src/sgml/indexam.sgml
+++ b/doc/src/sgml/indexam.sgml
@@ -146,6 +146,7 @@ <title>Basic API Structure for Indexes</title>
amvacuumcleanup_function amvacuumcleanup;
amcanreturn_function amcanreturn; /* can be NULL */
amcostestimate_function amcostestimate;
+ amgettreeheight_function amgettreeheight; /* can be NULL */
amoptions_function amoptions;
amproperty_function amproperty; /* can be NULL */
ambuildphasename_function ambuildphasename; /* can be NULL */
@@ -480,6 +481,21 @@ <title>Index Access Method Functions</title>
<para>
<programlisting>
+int
+amgettreeheight (Relation rel);
+</programlisting>
+ Compute the height of a tree-shaped index. This information is supplied to
+ the <function>amcostestimate</function> function in
+ <literal>path->indexinfo->tree_height</literal> and can be used to support
+ the cost estimation. The result is not used anywhere else, so this
+ function can actually be used to compute any kind of data (that fits into
+ an integer) about the index that the cost estimation function might want to
+ know. If the computation is expensive, it could be useful to cache the
+ result as part of <literal>RelationData.rd_amcache</literal>.
+ </para>
+
+ <para>
+<programlisting>
bytea *
amoptions (ArrayType *reloptions,
bool validate);
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 6467bed604a..94a8bd07017 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -279,6 +279,7 @@ brinhandler(PG_FUNCTION_ARGS)
amroutine->amvacuumcleanup = brinvacuumcleanup;
amroutine->amcanreturn = NULL;
amroutine->amcostestimate = brincostestimate;
+ amroutine->amgettreeheight = NULL;
amroutine->amoptions = brinoptions;
amroutine->amproperty = NULL;
amroutine->ambuildphasename = NULL;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 5747ae6a4ca..830d67fbc20 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -69,6 +69,7 @@ ginhandler(PG_FUNCTION_ARGS)
amroutine->amvacuumcleanup = ginvacuumcleanup;
amroutine->amcanreturn = NULL;
amroutine->amcostestimate = gincostestimate;
+ amroutine->amgettreeheight = NULL;
amroutine->amoptions = ginoptions;
amroutine->amproperty = NULL;
amroutine->ambuildphasename = NULL;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index ed4ffa63a77..2d7a0687d4a 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -91,6 +91,7 @@ gisthandler(PG_FUNCTION_ARGS)
amroutine->amvacuumcleanup = gistvacuumcleanup;
amroutine->amcanreturn = gistcanreturn;
amroutine->amcostestimate = gistcostestimate;
+ amroutine->amgettreeheight = NULL;
amroutine->amoptions = gistoptions;
amroutine->amproperty = gistproperty;
amroutine->ambuildphasename = NULL;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 01d06b7c328..a783b9b4e25 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -89,6 +89,7 @@ hashhandler(PG_FUNCTION_ARGS)
amroutine->amvacuumcleanup = hashvacuumcleanup;
amroutine->amcanreturn = NULL;
amroutine->amcostestimate = hashcostestimate;
+ amroutine->amgettreeheight = NULL;
amroutine->amoptions = hashoptions;
amroutine->amproperty = NULL;
amroutine->ambuildphasename = NULL;
diff --git a/src/backend/access/nbtree/nbtree.c
b/src/backend/access/nbtree/nbtree.c
index 686a3206f72..8cfaab949be 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -133,6 +133,7 @@ bthandler(PG_FUNCTION_ARGS)
amroutine->amvacuumcleanup = btvacuumcleanup;
amroutine->amcanreturn = btcanreturn;
amroutine->amcostestimate = btcostestimate;
+ amroutine->amgettreeheight = btgettreeheight;
amroutine->amoptions = btoptions;
amroutine->amproperty = btproperty;
amroutine->ambuildphasename = btbuildphasename;
@@ -1445,3 +1446,12 @@ btcanreturn(Relation index, int attno)
{
return true;
}
+
+/*
+ * btgettreeheight() -- Compute tree height for use by btcostestimate().
+ */
+int
+btgettreeheight(Relation rel)
+{
+ return _bt_getrootheight(rel);
+}
diff --git a/src/backend/access/spgist/spgutils.c
b/src/backend/access/spgist/spgutils.c
index 76b80146ff0..72b7661971f 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -76,6 +76,7 @@ spghandler(PG_FUNCTION_ARGS)
amroutine->amvacuumcleanup = spgvacuumcleanup;
amroutine->amcanreturn = spgcanreturn;
amroutine->amcostestimate = spgcostestimate;
+ amroutine->amgettreeheight = NULL;
amroutine->amoptions = spgoptions;
amroutine->amproperty = spgproperty;
amroutine->ambuildphasename = NULL;
diff --git a/src/backend/optimizer/util/plancat.c
b/src/backend/optimizer/util/plancat.c
index 78a3cfafde4..3f3a5dfae6f 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -485,13 +485,12 @@ get_relation_info(PlannerInfo *root, Oid
relationObjectId, bool inhparent,
info->tuples = rel->tuples;
}
- if (info->relam == BTREE_AM_OID)
+ /*
+ * Get tree height while we have the index open
+ */
+ if (amroutine->amgettreeheight)
{
- /*
- * For btrees, get tree height while we
have the index
- * open
- */
- info->tree_height =
_bt_getrootheight(indexRelation);
+ info->tree_height =
amroutine->amgettreeheight(indexRelation);
}
else
{
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index f25c9d58a7d..c51de742ea0 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -140,6 +140,13 @@ typedef void (*amcostestimate_function) (struct
PlannerInfo *root,
double *indexCorrelation,
double *indexPages);
+/* estimate height of a tree-structured index
+ *
+ * XXX This just computes a value that is later used by amcostestimate. This
+ * API could be expanded to support passing more values if the need arises.
+ */
+typedef int (*amgettreeheight_function) (Relation rel);
+
/* parse index reloptions */
typedef bytea *(*amoptions_function) (Datum reloptions,
bool
validate);
@@ -272,6 +279,7 @@ typedef struct IndexAmRoutine
amvacuumcleanup_function amvacuumcleanup;
amcanreturn_function amcanreturn; /* can be NULL */
amcostestimate_function amcostestimate;
+ amgettreeheight_function amgettreeheight; /* can be NULL */
amoptions_function amoptions;
amproperty_function amproperty; /* can be NULL */
ambuildphasename_function ambuildphasename; /* can be NULL */
diff --git a/src/include/access/nbtree.h b/src/include/access/nbtree.h
index 9af9b3ecdcc..d64300fb973 100644
--- a/src/include/access/nbtree.h
+++ b/src/include/access/nbtree.h
@@ -1186,6 +1186,7 @@ extern IndexBulkDeleteResult
*btbulkdelete(IndexVacuumInfo *info,
extern IndexBulkDeleteResult *btvacuumcleanup(IndexVacuumInfo *info,
IndexBulkDeleteResult *stats);
extern bool btcanreturn(Relation index, int attno);
+extern int btgettreeheight(Relation rel);
/*
* prototypes for internal functions in nbtree.c
diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c
b/src/test/modules/dummy_index_am/dummy_index_am.c
index 0b477116067..2841cf2eb4b 100644
--- a/src/test/modules/dummy_index_am/dummy_index_am.c
+++ b/src/test/modules/dummy_index_am/dummy_index_am.c
@@ -308,6 +308,7 @@ dihandler(PG_FUNCTION_ARGS)
amroutine->amvacuumcleanup = divacuumcleanup;
amroutine->amcanreturn = NULL;
amroutine->amcostestimate = dicostestimate;
+ amroutine->amgettreeheight = NULL;
amroutine->amoptions = dioptions;
amroutine->amproperty = NULL;
amroutine->ambuildphasename = NULL;
--
2.46.0