Thanks for doing all this.  (Do I understand correctly that this patch
is not in the commitfest?)

I think my mental model for this was that "allnulls" meant that either
there are no values for the column in question or that the values were
all nulls (For minmax without NULL handling, which is where this all
started, these two things are essentially the same: the range is not to
be returned.  So this became a bug the instant I added handling for NULL
values.)  I failed to realize that these were two different things, and
this is likely the origin of all these troubles.

What do you think of using the unused bit in BrinTuple->bt_info to
denote a range that contains no heap tuples?  This also means we need it
in BrinMemTuple, I think we can do this:

@@ -44,6 +44,7 @@ typedef struct BrinValues
 typedef struct BrinMemTuple
 {
        bool            bt_placeholder; /* this is a placeholder tuple */
+       bool            bt_empty_range; /* range has no tuples */
        BlockNumber bt_blkno;           /* heap blkno that the tuple is for */
        MemoryContext bt_context;       /* memcxt holding the bt_columns values 
*/
        /* output arrays for brin_deform_tuple: */
@@ -69,7 +70,7 @@ typedef struct BrinTuple
         *
         * 7th (high) bit: has nulls
         * 6th bit: is placeholder tuple
-        * 5th bit: unused
+        * 5th bit: range has no tuples
         * 4-0 bit: offset of data
         * ---------------
         */
@@ -82,7 +83,7 @@ typedef struct BrinTuple
  * bt_info manipulation macros
  */
 #define BRIN_OFFSET_MASK               0x1F
-/* bit 0x20 is not used at present */
+#define BRIN_EMPTY_RANGE               0x20
 #define BRIN_PLACEHOLDER_MASK  0x40
 #define BRIN_NULLS_MASK                        0x80
 
(Note that bt_empty_range uses a hole in the struct, so there's no ABI
change.)

This is BRIN-tuple-level, not column-level, so conceptually it seems
more appropriate.  (In the case where both are empty in union_tuples, we
can return without entering the per-attribute loop at all, though I
admit it's not a very interesting case.)  This approach avoids having to
invent the strange combination of all+has to mean empty.


On 2023-Feb-24, Tomas Vondra wrote:

> I wonder what's the best
> way to test this in an automated way - it's very dependent on timing of
> the concurrent updated. For example we need to do something like this:
> 
>     T1: run pg_summarize_range() until it inserts the placeholder tuple
>     T2: do an insert into the page range (updates placeholder)
>     T1: continue pg_summarize_range() to merge into the placeholder
> 
> But there are no convenient ways to do this, I think. I had to check the
> various cases using breakpoints in gdb etc.

Yeah, I struggled with this during initial development but in the end
did nothing.  I think we would need to introduce some new framework,
perhaps Korotkov stop-events stuff at 
https://postgr.es/m/CAPpHfdsTeb+hBT5=qxghjng_chcjldanq9sdy9vnwbf2e2p...@mail.gmail.com
which seemed to me a good fit -- we would add a stop point after the
placeholder tuple is inserted.

> I'm not very happy with the union_tuples() changes - it's quite verbose,
> perhaps a bit too verbose. We have to check for empty ranges first, and
> then various combinations of allnulls/hasnulls flags for both BRIN
> tuples. There are 9 combinations, and the current code just checks them
> one by one - I was getting repeatedly confused by the original code, but
> maybe it's too much.

I think it's okay.  I tried to make it more compact (by saying "these
two combinations here are case 2, and these two other are case 4", and
keeping each of the other combinations a separate case; so there are
really 7 cases).  But that doesn't make it any easier to follow, on the
contrary it was more convoluted.  I think a dozen extra lines of source
is not a problem.

> The alternative is to apply the same fix to every BRIN_PROCNUM_UNION
> opclass procedure out there. I guess doing that for minmax+inclusion is
> not a huge deal, but what about external opclasses? And without the fix
> the indexes are effectively broken. Fixing this outside in brin.c (in
> the union procedure) fixes this for every opclass procedure, without any
> actual limitation of functinality (14+ does that anyway).

About the hypothetical question, you could as well ask what about
unicorns.  I have never seen any hint that any external opclass exist.
I am all for maintaining compatibility, but I think this concern is
overblown for BRIN.  Anyway, I think your proposed fix is better than
changing individual 'union' support procs, so it doesn't matter.

As far as I understood, you're now worried that there will be an
incompatibility because we will fail to call the 'union' procedure in
cases where we previously called it?  In other words, you fear that some
hypothetical opclass was handling the NULL values in some way that's
incompatible with this?  I haven't thought terribly hard about this, but
I can't see a way for this to cause incompatibilities.

> But maybe someone thinks this is a bad idea and we should do something
> else in the backbranches?

I think the new handling of NULLs in commit 72ccf55cb99c ("Move IS [NOT]
NULL handling from BRIN support functions") is better than what was
there before, so I don't object to backpatching it now that we know it's
necessary to fix a bug, and also we have field experience that the
approach is solid.

The attached patch is just a pointer to comments that I think need light
edition.  There's also a typo "bot" (for "both") in a comment that I
think would go away if you accept my suggestion to store 'empty' at the
tuple level.  Note that I worked with the REL_14_STABLE sources, because
for some reason I thought that that was the newest that needed
backpatching of 72ccf55cb99c, but now that I'm finishing this email I
realize that I should have used 13 instead /facepalm

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"La persona que no quería pecar / estaba obligada a sentarse
 en duras y empinadas sillas    / desprovistas, por cierto
 de blandos atenuantes"                          (Patricio Vogel)
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 95ed4ef362..0dddc6fa9c 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -594,9 +594,9 @@ bringetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
                                        bval = &dtup->bt_columns[attno - 1];
 
                                        /*
-                                        * If the range has both allnulls and 
hasnulls set, it means
-                                        * there are no rows in the range, so 
we can skip it (we know
-                                        * there's nothing to match).
+                                        * If the BRIN tuple indicates that 
this range is empty,
+                                        * we can skip it: there's nothing to 
match.  We don't
+                                        * need to examine the next columns.
                                         */
                                        if (BRIN_RANGE_IS_EMPTY(bval))
                                        {
diff --git a/src/backend/access/brin/brin_tuple.c 
b/src/backend/access/brin/brin_tuple.c
index 7355e330f9..b3ba5ac365 100644
--- a/src/backend/access/brin/brin_tuple.c
+++ b/src/backend/access/brin/brin_tuple.c
@@ -607,8 +607,8 @@ brin_deform_tuple(BrinDesc *brdesc, BrinTuple *tuple, 
BrinMemTuple *dMemtuple)
 
                /*
                 * Make sure to overwrite the hasnulls flag, because it was 
initialized
-                * to true by brin_memtuple_initialize and we don't want to 
skip it if
-                * allnulls=true.
+                * to true by brin_memtuple_initialize and we don't want to 
skip [it] if
+                * allnulls=true. (XXX "it" what?)
                 */
                dtup->bt_columns[keyno].bv_hasnulls = hasnulls[keyno];
 

Reply via email to