Hi,
While working on the BRIN SK_SEARCHARRAY patch I noticed a silly bug in
handling clauses on multi-column BRIN indexes, introduced in PG13.
Consider a simple table with two columns (a,b) and a multi-column BRIN
index on them:
create table t (a int, b int);
insert into t
select
mod(i,10000) + 100 * random(),
mod(i,10000) + 100 * random()
from generate_series(1,1000000) s(i);
create index on t using brin(a int4_minmax_ops, b int4_minmax_ops)
with (pages_per_range=1);
Let's run a query with a condition on "a":
select * from t where a = 500;
QUERY PLAN
-----------------------------------------------------------------
Bitmap Heap Scan on t (actual rows=97 loops=1)
Recheck Cond: (a = 500)
Rows Removed by Index Recheck: 53189
Heap Blocks: lossy=236
-> Bitmap Index Scan on t_a_b_idx (actual rows=2360 loops=1)
Index Cond: (a = 500)
Planning Time: 0.075 ms
Execution Time: 8.263 ms
(8 rows)
Now let's add another condition on b:
select * from t where a = 500 and b < 800;
QUERY PLAN
-----------------------------------------------------------------
Bitmap Heap Scan on t (actual rows=97 loops=1)
Recheck Cond: ((a = 500) AND (b < 800))
Rows Removed by Index Recheck: 101101
Heap Blocks: lossy=448
-> Bitmap Index Scan on t_a_b_idx (actual rows=4480 loops=1)
Index Cond: ((a = 500) AND (b < 800))
Planning Time: 0.085 ms
Execution Time: 14.989 ms
(8 rows)
Well, that's wrong. With one condition we accessed 236 pages, and with
an additional condition (which should reduce the number of heap pages)
we accessed 448 pages.
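The problem is in bringetbitmap(), which fails to combine the results
from the consistent functions correctly: the result for each batch of
scan keys overwrites the previous one instead of being ANDed with it.
It also does not abort early once a key rules the range out.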
Here's a patch for that, I'll push it shortly after a bit more testing.
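To illustrate the difference, here is a minimal standalone sketch. It is
not the brin.c code. RangeSummary, ToyKey, toy_consistent() and the two
range_matches_*() functions are hypothetical names made up for this
example, and the "consistent" check is simplified to an interval overlap
test on a minmax summary.

```c
#include <assert.h>
#include <limits.h>   /* INT_MIN / INT_MAX for open-ended keys */
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the minmax summary of one column in a range. */
typedef struct
{
    int min;
    int max;
} RangeSummary;

/*
 * Hypothetical scan key: the column it applies to, plus an inclusive
 * interval [lo, hi] that the column value must fall into.
 * "a = 500" becomes {0, 500, 500}; "b < 800" becomes {1, INT_MIN, 799}.
 */
typedef struct
{
    int attno;          /* 0-based column index (an assumption of this sketch) */
    int lo;
    int hi;
} ToyKey;

/* Simplified "consistent" check: can the range contain a matching value? */
bool
toy_consistent(const RangeSummary *summary, const ToyKey *key)
{
    return summary->min <= key->hi && key->lo <= summary->max;
}

/* Old behaviour: every key overwrites the result, so the last key wins. */
bool
range_matches_last_wins(const RangeSummary *cols, const ToyKey *keys, size_t nkeys)
{
    bool addrange = true;

    assert(cols != NULL && keys != NULL);
    for (size_t i = 0; i < nkeys; i++)
        addrange = toy_consistent(&cols[keys[i].attno], &keys[i]);

    return addrange;
}

/* Fixed behaviour: AND the results and stop at the first mismatch. */
bool
range_matches_and(const RangeSummary *cols, const ToyKey *keys, size_t nkeys)
{
    bool addrange = true;

    assert(cols != NULL && keys != NULL);
    for (size_t i = 0; i < nkeys; i++)
    {
        addrange &= toy_consistent(&cols[keys[i].attno], &keys[i]);
        if (!addrange)
            break;      /* this range is eliminated, skip remaining keys */
    }

    return addrange;
}
```

For a range where both a and b span [600, 700], the keys (a = 500) and
(b < 800) make range_matches_last_wins() keep the range, because only
the result for b survives, while range_matches_and() discards it.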
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From bbe40c94e2293849d977da4720ef76d13160347a Mon Sep 17 00:00:00 2001
From: Tomas Vondra <to...@2ndquadrant.com>
Date: Wed, 15 Feb 2023 17:32:53 +0100
Subject: [PATCH 01/10] Fix handling of multi-column BRIN indexes
When evaluating clauses on multiple keys of multi-column BRIN indexes,
the results should be combined using AND, and we can stop evaluating
once we find a mismatching scan key.
The existing code was simply scanning a range as long as the last batch
of scan keys returned true.
---
src/backend/access/brin/brin.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index de1427a1e0..85ae795949 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -660,7 +660,7 @@ bringetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
PointerGetDatum(bval),
PointerGetDatum(keys[attno - 1]),
Int32GetDatum(nkeys[attno - 1]));
- addrange = DatumGetBool(add);
+ addrange &= DatumGetBool(add);
}
else
{
@@ -681,11 +681,18 @@ bringetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
PointerGetDatum(bdesc),
PointerGetDatum(bval),
PointerGetDatum(keys[attno - 1][keyno]));
- addrange = DatumGetBool(add);
+ addrange &= DatumGetBool(add);
if (!addrange)
break;
}
}
+
+ /*
+ * We found a clause that eliminates this range. No point
+ * in evaluating more clauses.
+ */
+ if (!addrange)
+ break;
}
}
}
--
2.39.1