On 2/21/25 07:17, Masahiko Sawada wrote:
On Thu, Feb 20, 2025 at 12:50 PM Markus Wanner
<markus.wan...@enterprisedb.com> wrote:
Hi,
I recently stumbled over an issue with an unintentional re-transmission.
While this clearly was our fault in the output plugin code, I think the
walsender's exposed API could easily be hardened to prevent the bad
consequence from this mistake.
Does anything speak against the attached one line patch?
According to the documentation[1], OutputPluginPrepareWrite() has to
be called before OutputPluginWrite().
Sure, that's exactly what we forgot. My complaint is that it's a mistake
that's unnecessarily hard to spot and the API could be improved. There
is no good reason to keep the sent data in the ctx->out buffer. But
clearly, there's some danger to it.
(Note that it wasn't quite as simple as you may think, though. With an
error involved in the send/recv loop of the walsender invoked from
OutputPluginWrite. The error handling code in PG_CATCH then attempting
to flush the queue with OutputPluginWrite.)
Arguably, one could even say that replacing the call to resetStringInfo
in WalSndPrepareWrite with an Assert(ctx->out->len == 0) would catch a
variant of improper use. And another Assert in WalSndWriteData to check
OutputPluginPrepareWrite had properly been invoked before can't hurt,
either.
Best Regards
Markus
From 5b9d6285252204d7109001fc42323add00ac773f Mon Sep 17 00:00:00 2001
From: Markus Wanner <markus.wan...@enterprisedb.com>
Date: Fri, 21 Feb 2025 09:00:28 +0100
Subject: [PATCH] Add asserts to guide to proper API usage
---
src/backend/replication/walsender.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 96884ce152f..b5d7baf83ad 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1523,7 +1523,7 @@ WalSndPrepareWrite(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xi
if (!last_write)
lsn = InvalidXLogRecPtr;
- resetStringInfo(ctx->out);
+ Assert(ctx->out->len == 0);
pq_sendbyte(ctx->out, 'w');
pq_sendint64(ctx->out, lsn); /* dataStart */
@@ -1549,6 +1549,8 @@ WalSndWriteData(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid,
{
TimestampTz now;
+ Assert(ctx->out->len >= 1 + 3 * sizeof(int64);
+
/*
* Fill the send timestamp last, so that it is taken as late as possible.
* This is somewhat ugly, but the protocol is set as it's already used for
--
2.39.5