Hi,

Thanks for working on this!


I think this should include a test that fails without this change and succeeds
with it...


On 2022-07-19 11:55:06 +0900, Kyotaro Horiguchi wrote:
> From abcf0a0e0b3e2de9927d8943a3e3c145ab189508 Mon Sep 17 00:00:00 2001
> From: Kyotaro Horiguchi <horikyota....@gmail.com>
> Date: Tue, 19 Jul 2022 11:50:29 +0900
> Subject: [PATCH v6] Create correct snapshot during CREATE_REPLICATION_SLOT

This sees a tad misleading - the previous snapshot wasn't borken, right?


> +/*
> + * ReorderBufferXidIsKnownSubXact
> + *           Returns true if the xid is a known subtransaction.
> + */
> +bool
> +ReorderBufferXidIsKnownSubXact(ReorderBuffer *rb, TransactionId xid)
> +{
> +     ReorderBufferTXN *txn;
> +
> +     txn = ReorderBufferTXNByXid(rb, xid, false,
> +                                                             NULL, 
> InvalidXLogRecPtr, false);
> +
> +     /* a known subtxn? */
> +     if (txn && rbtxn_is_known_subxact(txn))
> +             return true;
> +
> +     return false;
> +}

The comments here just seem to restate the code....


It's not obvious to me that it's the right design (or even correct) to ask
reorderbuffer about an xid being a subxid. Maybe I'm missing something, but
why would reorderbuffer even be guaranteed to know about all these subxids?


> @@ -568,9 +571,17 @@ SnapBuildInitialSnapshot(SnapBuild *builder)
>
>       MyProc->xmin = snap->xmin;
>
> -     /* allocate in transaction context */
> +     /*
> +      * Allocate in transaction context.
> +      *
> +      * We could use only subxip to store all xids (takenduringrecovery
> +      * snapshot) but that causes useless visibility checks later so we 
> hasle to
> +      * create a normal snapshot.
> +      */

I can't really parse this comment at this point, and I seriously doubt I could
later on.


> @@ -591,12 +605,24 @@ SnapBuildInitialSnapshot(SnapBuild *builder)
>
>               if (test == NULL)
>               {
> -                     if (newxcnt >= GetMaxSnapshotXidCount())
> -                             ereport(ERROR,
> -                                             
> (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
> -                                              errmsg("initial slot snapshot 
> too large")));
> -
> -                     newxip[newxcnt++] = xid;
> +                     /* Store the xid to the appropriate xid array */
> +                     if (ReorderBufferXidIsKnownSubXact(builder->reorder, 
> xid))
> +                     {
> +                             if (!overflowed)
> +                             {
> +                                     if (newsubxcnt >= 
> GetMaxSnapshotSubxidCount())
> +                                             overflowed = true;
> +                                     else
> +                                             newsubxip[newsubxcnt++] = xid;
> +                             }
> +                     }
> +                     else
> +                     {
> +                             if (newxcnt >= GetMaxSnapshotXidCount())
> +                                     elog(ERROR,
> +                                              "too many transactions while 
> creating snapshot");
> +                             newxip[newxcnt++] = xid;
> +                     }
>               }

Hm, this is starting to be pretty deeply nested...


I wonder if a better fix here wouldn't be to allow importing a snapshot with a
larger ->xid array. Yes, we can't do that in CurrentSnapshotData, but IIRC we
need to be in a transactional snapshot anyway, which is copied anyway?

Greetings,

Andres Freund


Reply via email to