On 10/21/22 18:44, Tomas Vondra wrote:
>
> ...
>
>> Apart from that, this patch looks good.
>>
Sadly, I don't think we can fix it like this :-(
The problem is that all ranges start with all_nulls=true, because the
new range gets initialized by brin_memtuple_initialize() like that. But
this happens for *every* range before we even start processing the rows.
So this way all the ranges would end up with has_nulls=true, making that
flag pretty useless.
Actually, even just doing "truncate" on the table creates such all-nulls
range for the first range, and serializes it to disk.
I wondered why we even write such tuples for "empty" ranges to disk, for
example after "TRUNCATE" - the table is empty by definition, so how come
we write all-nulls brin summary for the first range?
For example brininsert() checks if the brin tuple was modified and needs
to be written back, but brinbuild() just ignores that, and initializes
(and writes) writes the tuple to disk anyway. I think we should not do
that - there should be a flag in BrinBuildState, tracking if the BRIN
tuple was modified, and we should only write it if it's true.
That means we should never get an on-disk summary representing nothing.
That doesn't fix the issue, though, because we still need to pass the
memtuple tuple to the add_value opclass procedure, and whether it sets
the has_nulls flag depends on whether it's a new tuple representing no
other rows (in which case has_nulls remains false) or whether it was
read from disk (in which case it needs to be flipped to 'true').
But the opclass has no way to tell the difference at the moment - it
just gets the BrinMemTuple. So we'd have to extend this, somehow.
I wonder how to do this in a back-patchable way - we can't add
parameters to the opclass procedure, and the other solution seems to be
storing it right in the BrinMemTuple, somehow. But that's likely an ABI
break :-(
The only solution I can think of is actually passing it using all_nulls
and has_nulls - we could set both flags to true (which we never do now)
and teach the opclass that it signifies "empty" (and thus not to update
has_nulls after resetting all_nulls).
Something like the attached (I haven't added any more tests, not sure
what would those look like - I can't think of a query testing this,
although maybe we could check how the flags change using pageinspect).
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From a99fd6a737cec24bb4063e99a241ff3e04c6ebb8 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.von...@postgresql.org>
Date: Thu, 20 Oct 2022 19:55:23 +0200
Subject: [PATCH 1/9] fixup: brin has_nulls
---
src/backend/access/brin/brin_bloom.c | 1 +
src/backend/access/brin/brin_inclusion.c | 1 +
src/backend/access/brin/brin_minmax.c | 1 +
src/backend/access/brin/brin_minmax_multi.c | 1 +
4 files changed, 4 insertions(+)
diff --git a/src/backend/access/brin/brin_bloom.c b/src/backend/access/brin/brin_bloom.c
index 6b0af7267d5..60315450b41 100644
--- a/src/backend/access/brin/brin_bloom.c
+++ b/src/backend/access/brin/brin_bloom.c
@@ -539,6 +539,7 @@ brin_bloom_add_value(PG_FUNCTION_ARGS)
BloomGetFalsePositiveRate(opts));
column->bv_values[0] = PointerGetDatum(filter);
column->bv_allnulls = false;
+ column->bv_hasnulls = true;
updated = true;
}
else
diff --git a/src/backend/access/brin/brin_inclusion.c b/src/backend/access/brin/brin_inclusion.c
index 4b02d374f23..e0f44d3e62c 100644
--- a/src/backend/access/brin/brin_inclusion.c
+++ b/src/backend/access/brin/brin_inclusion.c
@@ -164,6 +164,7 @@ brin_inclusion_add_value(PG_FUNCTION_ARGS)
column->bv_values[INCLUSION_UNMERGEABLE] = BoolGetDatum(false);
column->bv_values[INCLUSION_CONTAINS_EMPTY] = BoolGetDatum(false);
column->bv_allnulls = false;
+ column->bv_hasnulls = true;
new = true;
}
diff --git a/src/backend/access/brin/brin_minmax.c b/src/backend/access/brin/brin_minmax.c
index 9e8a8e056cc..8a5661a8952 100644
--- a/src/backend/access/brin/brin_minmax.c
+++ b/src/backend/access/brin/brin_minmax.c
@@ -90,6 +90,7 @@ brin_minmax_add_value(PG_FUNCTION_ARGS)
column->bv_values[0] = datumCopy(newval, attr->attbyval, attr->attlen);
column->bv_values[1] = datumCopy(newval, attr->attbyval, attr->attlen);
column->bv_allnulls = false;
+ column->bv_hasnulls = true;
PG_RETURN_BOOL(true);
}
diff --git a/src/backend/access/brin/brin_minmax_multi.c b/src/backend/access/brin/brin_minmax_multi.c
index 9a0bcf6698d..4e7119e2d78 100644
--- a/src/backend/access/brin/brin_minmax_multi.c
+++ b/src/backend/access/brin/brin_minmax_multi.c
@@ -2500,6 +2500,7 @@ brin_minmax_multi_add_value(PG_FUNCTION_ARGS)
MemoryContextSwitchTo(oldctx);
column->bv_allnulls = false;
+ column->bv_hasnulls = true;
modified = true;
column->bv_mem_value = PointerGetDatum(ranges);
--
2.37.3
From 57e53d34f2f7bba91fcc0de6f4eff551669554fb Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.von...@postgresql.org>
Date: Sat, 22 Oct 2022 02:26:48 +0200
Subject: [PATCH 2/9] fixup: brin has_nulls 2
---
src/backend/access/brin/brin.c | 22 +++++++++++++--------
src/backend/access/brin/brin_bloom.c | 10 +++++++++-
src/backend/access/brin/brin_inclusion.c | 10 +++++++++-
src/backend/access/brin/brin_minmax.c | 10 +++++++++-
src/backend/access/brin/brin_minmax_multi.c | 10 +++++++++-
src/backend/access/brin/brin_tuple.c | 2 +-
6 files changed, 51 insertions(+), 13 deletions(-)
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 20b7d65b948..6fabd14c263 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -56,6 +56,7 @@ typedef struct BrinBuildState
BrinRevmap *bs_rmAccess;
BrinDesc *bs_bdesc;
BrinMemTuple *bs_dtuple;
+ bool bs_modified;
} BrinBuildState;
/*
@@ -793,6 +794,7 @@ brinbuildCallback(Relation index,
/* set state to correspond to the next range */
state->bs_currRangeStart += state->bs_pagesPerRange;
+ state->bs_modified = false;
/* re-initialize state for it */
brin_memtuple_initialize(state->bs_dtuple, state->bs_bdesc);
@@ -801,6 +803,7 @@ brinbuildCallback(Relation index,
/* Accumulate the current tuple into the running state */
(void) add_values_to_range(index, state->bs_bdesc, state->bs_dtuple,
values, isnull);
+ state->bs_modified = true;
}
/*
@@ -1287,6 +1290,7 @@ initialize_brin_buildstate(Relation idxRel, BrinRevmap *revmap,
state->bs_rmAccess = revmap;
state->bs_bdesc = brin_build_desc(idxRel);
state->bs_dtuple = brin_new_memtuple(state->bs_bdesc);
+ state->bs_modified = false;
return state;
}
@@ -1569,14 +1573,16 @@ form_and_insert_tuple(BrinBuildState *state)
BrinTuple *tup;
Size size;
- tup = brin_form_tuple(state->bs_bdesc, state->bs_currRangeStart,
- state->bs_dtuple, &size);
- brin_doinsert(state->bs_irel, state->bs_pagesPerRange, state->bs_rmAccess,
- &state->bs_currentInsertBuf, state->bs_currRangeStart,
- tup, size);
- state->bs_numtuples++;
-
- pfree(tup);
+ if (state->bs_modified)
+ {
+ tup = brin_form_tuple(state->bs_bdesc, state->bs_currRangeStart,
+ state->bs_dtuple, &size);
+ brin_doinsert(state->bs_irel, state->bs_pagesPerRange, state->bs_rmAccess,
+ &state->bs_currentInsertBuf, state->bs_currRangeStart,
+ tup, size);
+ state->bs_numtuples++;
+ pfree(tup);
+ }
}
/*
diff --git a/src/backend/access/brin/brin_bloom.c b/src/backend/access/brin/brin_bloom.c
index 60315450b41..96e5961a408 100644
--- a/src/backend/access/brin/brin_bloom.c
+++ b/src/backend/access/brin/brin_bloom.c
@@ -539,7 +539,15 @@ brin_bloom_add_value(PG_FUNCTION_ARGS)
BloomGetFalsePositiveRate(opts));
column->bv_values[0] = PointerGetDatum(filter);
column->bv_allnulls = false;
- column->bv_hasnulls = true;
+
+ /*
+ * When both bv_allnulls and bv_hasnulls are set to true, it means this
+ * summary was representing no rows. So we just set bv_hasnulls=false.
+ * Otherwise we need to set it to true, because there already were some
+ * NULL values, apparently.
+ */
+ column->bv_hasnulls = !column->bv_hasnulls;
+
updated = true;
}
else
diff --git a/src/backend/access/brin/brin_inclusion.c b/src/backend/access/brin/brin_inclusion.c
index e0f44d3e62c..dd8fe379c7b 100644
--- a/src/backend/access/brin/brin_inclusion.c
+++ b/src/backend/access/brin/brin_inclusion.c
@@ -164,7 +164,15 @@ brin_inclusion_add_value(PG_FUNCTION_ARGS)
column->bv_values[INCLUSION_UNMERGEABLE] = BoolGetDatum(false);
column->bv_values[INCLUSION_CONTAINS_EMPTY] = BoolGetDatum(false);
column->bv_allnulls = false;
- column->bv_hasnulls = true;
+
+ /*
+ * When both bv_allnulls and bv_hasnulls are set to true, it means this
+ * summary was representing no rows. So we just set bv_hasnulls=false.
+ * Otherwise we need to set it to true, because there already were some
+ * NULL values, apparently.
+ */
+ column->bv_hasnulls = !column->bv_hasnulls;
+
new = true;
}
diff --git a/src/backend/access/brin/brin_minmax.c b/src/backend/access/brin/brin_minmax.c
index 8a5661a8952..ead9e8f4e36 100644
--- a/src/backend/access/brin/brin_minmax.c
+++ b/src/backend/access/brin/brin_minmax.c
@@ -90,7 +90,15 @@ brin_minmax_add_value(PG_FUNCTION_ARGS)
column->bv_values[0] = datumCopy(newval, attr->attbyval, attr->attlen);
column->bv_values[1] = datumCopy(newval, attr->attbyval, attr->attlen);
column->bv_allnulls = false;
- column->bv_hasnulls = true;
+
+ /*
+ * When both bv_allnulls and bv_hasnulls are set to true, it means this
+ * summary was representing no rows. So we just set bv_hasnulls=false.
+ * Otherwise we need to set it to true, because there already were some
+ * NULL values, apparently.
+ */
+ column->bv_hasnulls = !column->bv_hasnulls;
+
PG_RETURN_BOOL(true);
}
diff --git a/src/backend/access/brin/brin_minmax_multi.c b/src/backend/access/brin/brin_minmax_multi.c
index 4e7119e2d78..410bfdcfa79 100644
--- a/src/backend/access/brin/brin_minmax_multi.c
+++ b/src/backend/access/brin/brin_minmax_multi.c
@@ -2500,7 +2500,15 @@ brin_minmax_multi_add_value(PG_FUNCTION_ARGS)
MemoryContextSwitchTo(oldctx);
column->bv_allnulls = false;
- column->bv_hasnulls = true;
+
+ /*
+ * When both bv_allnulls and bv_hasnulls are set to true, it means this
+ * summary was representing no rows. So we just set bv_hasnulls=false.
+ * Otherwise we need to set it to true, because there already were some
+ * NULL values, apparently.
+ */
+ column->bv_hasnulls = !column->bv_hasnulls;
+
modified = true;
column->bv_mem_value = PointerGetDatum(ranges);
diff --git a/src/backend/access/brin/brin_tuple.c b/src/backend/access/brin/brin_tuple.c
index c0e2dbd23ba..4d2a45ddcb6 100644
--- a/src/backend/access/brin/brin_tuple.c
+++ b/src/backend/access/brin/brin_tuple.c
@@ -517,7 +517,7 @@ brin_memtuple_initialize(BrinMemTuple *dtuple, BrinDesc *brdesc)
{
dtuple->bt_columns[i].bv_attno = i + 1;
dtuple->bt_columns[i].bv_allnulls = true;
- dtuple->bt_columns[i].bv_hasnulls = false;
+ dtuple->bt_columns[i].bv_hasnulls = true;
dtuple->bt_columns[i].bv_values = (Datum *) currdatum;
dtuple->bt_columns[i].bv_mem_value = PointerGetDatum(NULL);
--
2.37.3