On 8/12/21 4:26 AM, Tomas Vondra wrote:
On 8/11/21 2:48 AM, Peter Geoghegan wrote:
On Wed, Jun 23, 2021 at 7:19 AM Andrey V. Lepikhov
<a.lepik...@postgrespro.ru> wrote:
Ivan Frolkov reported a problem with choosing a non-optimal index during
a query optimization. This problem appeared after building of an
extended statistics.

Any thoughts on this, Tomas?


Thanks for reminding me, I missed / forgot about this thread.

I agree the current behavior is unfortunate, but I'm not convinced the proposed patch is fixing the right place - doesn't this mean the index costing won't match the row estimates displayed by EXPLAIN?
I think, it is not a problem. In EXPLAIN you will see only 1 row with/without this patch.

I wonder if we should teach clauselist_selectivity about UNIQUE indexes, and improve the cardinality estimates directly, not just costing for index scans.
This idea looks better. I will try to implement it.

Also, is it correct that the patch calculates num_sa_scans only when (numIndexTuples >= 0.0)?
Thanks, fixed.

--
regards,
Andrey Lepikhov
Postgres Professional
>From 8a4ad08d61a5d14a45ef5e182f002e918f0eaccc Mon Sep 17 00:00:00 2001
From: Andrey Lepikhov <a.lepik...@postgrespro.ru>
Date: Wed, 23 Jun 2021 12:05:24 +0500
Subject: [PATCH] In the case of an unique one row btree index scan only one
 row can be returned. In the genericcostestimate() routine we must arrange the
 index selectivity value in accordance with this knowledge.

---
 src/backend/utils/adt/selfuncs.c | 78 ++++++++++++++++++++------------
 1 file changed, 48 insertions(+), 30 deletions(-)

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 0c8c05f6c2..9538c4a5b4 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -6369,15 +6369,9 @@ genericcostestimate(PlannerInfo *root,
 	double		num_scans;
 	double		qual_op_cost;
 	double		qual_arg_cost;
-	List	   *selectivityQuals;
 	ListCell   *l;
 
-	/*
-	 * If the index is partial, AND the index predicate with the explicitly
-	 * given indexquals to produce a more accurate idea of the index
-	 * selectivity.
-	 */
-	selectivityQuals = add_predicate_to_index_quals(index, indexQuals);
+	numIndexTuples = costs->numIndexTuples;
 
 	/*
 	 * Check for ScalarArrayOpExpr index quals, and estimate the number of
@@ -6398,36 +6392,59 @@ genericcostestimate(PlannerInfo *root,
 		}
 	}
 
-	/* Estimate the fraction of main-table tuples that will be visited */
-	indexSelectivity = clauselist_selectivity(root, selectivityQuals,
-											  index->rel->relid,
-											  JOIN_INNER,
-											  NULL);
+	if (numIndexTuples >= 0.0)
+	{
+		List		*selectivityQuals;
 
-	/*
-	 * If caller didn't give us an estimate, estimate the number of index
-	 * tuples that will be visited.  We do it in this rather peculiar-looking
-	 * way in order to get the right answer for partial indexes.
-	 */
-	numIndexTuples = costs->numIndexTuples;
-	if (numIndexTuples <= 0.0)
+		/*
+		 * If the index is partial, AND the index predicate with the explicitly
+		 * given indexquals to produce a more accurate idea of the index
+		 * selectivity.
+		 */
+		selectivityQuals = add_predicate_to_index_quals(index, indexQuals);
+
+		/* Estimate the fraction of main-table tuples that will be visited */
+		indexSelectivity = clauselist_selectivity(root, selectivityQuals,
+												  index->rel->relid,
+												  JOIN_INNER,
+												  NULL);
+
+		/*
+		 * If caller didn't give us an estimate, estimate the number of index
+		 * tuples that will be visited.  We do it in this rather peculiar-looking
+		 * way in order to get the right answer for partial indexes.
+		 */
+		if (numIndexTuples == 0.0)
+		{
+			numIndexTuples = indexSelectivity * index->rel->tuples;
+
+			/*
+			 * The above calculation counts all the tuples visited across all
+			 * scans induced by ScalarArrayOpExpr nodes.  We want to consider the
+			 * average per-indexscan number, so adjust.  This is a handy place to
+			 * round to integer, too.  (If caller supplied tuple estimate, it's
+			 * responsible for handling these considerations.)
+			 */
+			numIndexTuples = rint(numIndexTuples / num_sa_scans);
+		}
+	}
+	else
 	{
-		numIndexTuples = indexSelectivity * index->rel->tuples;
+		Assert(numIndexTuples == -1.0);
 
 		/*
-		 * The above calculation counts all the tuples visited across all
-		 * scans induced by ScalarArrayOpExpr nodes.  We want to consider the
-		 * average per-indexscan number, so adjust.  This is a handy place to
-		 * round to integer, too.  (If caller supplied tuple estimate, it's
-		 * responsible for handling these considerations.)
+		 * Unique one row scan can select no more than one row. It needs to
+		 * manually set the selectivity of the index. The value of numIndexTuples
+		 * will be corrected later.
 		 */
-		numIndexTuples = rint(numIndexTuples / num_sa_scans);
+		indexSelectivity = 1.0 / index->rel->tuples;
 	}
 
 	/*
 	 * We can bound the number of tuples by the index size in any case. Also,
 	 * always estimate at least one tuple is touched, even when
 	 * indexSelectivity estimate is tiny.
+	 * Also, one row single scan case need to fix the numIndexTuples value here.
 	 */
 	if (numIndexTuples > index->tuples)
 		numIndexTuples = index->tuples;
@@ -6710,16 +6727,17 @@ btcostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 
 	/*
 	 * If index is unique and we found an '=' clause for each column, we can
-	 * just assume numIndexTuples = 1 and skip the expensive
-	 * clauselist_selectivity calculations.  However, a ScalarArrayOp or
-	 * NullTest invalidates that theory, even though it sets eqQualHere.
+	 * skip the expensive clauselist_selectivity calculations and assume
+	 * numIndexTuples = -1.0 in order to tell the generic cost estimator to skip
+	 * such expensive calculations too. However, a ScalarArrayOp or NullTest
+	 * invalidates that theory, even though it sets eqQualHere.
 	 */
 	if (index->unique &&
 		indexcol == index->nkeycolumns - 1 &&
 		eqQualHere &&
 		!found_saop &&
 		!found_is_null_op)
-		numIndexTuples = 1.0;
+		numIndexTuples = -1.0;
 	else
 	{
 		List	   *selectivityQuals;
-- 
2.25.1

Reply via email to