Dear Horiguchi-san,

> If I'm grabbing the discussion here correctly, in my memory, it is
> because: physical replication needs all records that have written on
> primary are written on standby for switchover to succeed. It is
> annoying that normal shutdown occasionally leads to switchover
> failure. Thus WalSndDone explicitly waits for remote flush/write
> regardless of the setting of synchronous_commit.

AFAIK the condition (sentPtr == replicatedPtr) was introduced for that
purpose [1].
Do you mean that the condition (!pq_is_send_pending()) has the same
motivation?

> Thus apply delay
> doesn't affect shutdown (AFAICS), and that is sufficient since all the
> records will be applied at the next startup.

I am not sure what "next startup" refers to, but I agree that we can shut down
the walsender when recovery_min_apply_delay > 0 and synchronous_commit =
remote_apply.
The startup process is not terminated even if the primary crashes, so I think
it will apply the transaction sooner or later.

> In logical replication apply precedes write and flush so we have no
> indication whether a record is "replicated" to standby by other than
> apply LSN. On the other hand, logical replication doesn't have a
> business with switchover so that assurance is useless. Thus I think
> we can (practically) ignore apply_lsn at shutdown. It seems subtly
> irregular, though.

Another consideration is that the condition (!pq_is_send_pending()) ensures
that there are no pending messages, including other kinds of packets.
Currently we force walsenders to send out all pending messages before shutting
down, even if the only one left is a keepalive.
I cannot think of any problems caused by this, but I can keep the condition for
logical replication.
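
As a rough illustration of the conditions discussed above, here is a
standalone model in C. It is not the actual walsender.c code: the type
sender_state and the functions replicated_lsn() and may_exit() are
hypothetical names, and the logical-replication branch is only an assumed
reading of the proposal (skip the remote flush/write confirmation but keep
the pending-output check).

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for XLogRecPtr; 0 plays the role of an invalid LSN here. */
typedef uint64_t lsn_t;

/* Hypothetical snapshot of the state the shutdown check looks at. */
typedef struct
{
    bool  caught_up;     /* all available WAL has been sent */
    lsn_t sent;          /* last LSN sent to the receiver */
    lsn_t write;         /* write LSN reported by the receiver */
    lsn_t flush;         /* flush LSN reported by the receiver, 0 if none */
    bool  send_pending;  /* output buffer still holds unsent bytes */
} sender_state;

/* Use the flush LSN when it is valid, otherwise fall back to the write LSN. */
static lsn_t
replicated_lsn(const sender_state *s)
{
    return s->flush != 0 ? s->flush : s->write;
}

/*
 * Returns true if the sender may exit during shutdown.
 *
 * Physical replication: caught up, remote position equals the sent
 * position, and nothing left in the output buffer.
 * Logical replication (assumption): the remote position is not checked,
 * but the output buffer must still be empty.
 */
static bool
may_exit(const sender_state *s, bool is_logical)
{
    if (!s->caught_up || s->send_pending)
        return false;
    if (is_logical)
        return true;
    return s->sent == replicated_lsn(s);
}
```

With sent = 100 and a receiver that has only confirmed 90, may_exit() returns
false for physical replication and true for logical replication, as long as
no output is pending.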

I updated the patch accordingly. Also, I found that the previous version
did not work well for streamed transactions. When a streamed transaction
is committed on the publisher but its application is delayed on the subscriber,
the walsender sometimes waits until there are no pending writes. This is done
in ProcessPendingWrites(). I added another termination path to that function.
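
The idea of an extra termination path in a wait loop can be sketched with a
toy model in C. This is not the code of ProcessPendingWrites() or of the
patch: wait_for_pending_writes(), its parameters, and the wait_result values
are all hypothetical, and a receiver that drains nothing per round is only an
assumed stand-in for a subscriber whose apply is delayed.

```c
#include <stdbool.h>

typedef enum
{
    WRITES_FLUSHED,     /* the output buffer became empty */
    EXIT_ON_SHUTDOWN,   /* left the loop because shutdown was requested */
    STILL_WAITING       /* gave up after max_rounds (model-only guard) */
} wait_result;

/*
 * Toy model of waiting for pending output to drain.
 *
 * pending            bytes queued for the receiver
 * drained_per_round  bytes the receiver accepts per loop round
 *                    (0 models a receiver that is not reading)
 * shutdown_round     round at which shutdown is requested, -1 for never
 * max_rounds         upper bound so the model always terminates
 * rounds_out         number of rounds completed before returning
 */
static wait_result
wait_for_pending_writes(int pending, int drained_per_round,
                        int shutdown_round, int max_rounds, int *rounds_out)
{
    for (int round = 0; round < max_rounds; round++)
    {
        /* Normal exit: nothing left to send. */
        if (pending <= 0)
        {
            *rounds_out = round;
            return WRITES_FLUSHED;
        }

        /* Additional exit: stop waiting once shutdown was requested. */
        if (shutdown_round >= 0 && round >= shutdown_round)
        {
            *rounds_out = round;
            return EXIT_ON_SHUTDOWN;
        }

        pending -= drained_per_round;
    }

    *rounds_out = max_rounds;
    return STILL_WAITING;
}
```

Without the second check, a call with drained_per_round = 0 would only end
through the max_rounds guard, which mirrors a sender that keeps waiting
while the receiver does not consume its output.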

[1]: https://github.com/postgres/postgres/commit/985bd7d49726c9f178558491d31a570d47340459

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment: v4-0001-Exit-walsender-before-confirming-remote-flush-in-.patch