On Thu, May 8, 2025 at 6:04 PM Zhijie Hou (Fujitsu) wrote:
>
> On Tue, May 6, 2025 at 7:22 PM Zhijie Hou (Fujitsu) wrote:
> >
> > > On Mon, May 5, 2025 at 6:59 PM Amit Kapila wrote:
> > >
> > > On Sun, May 4, 2025 at 2:33 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> > > >
> > > > While I cannot be entirely certain of my analysis, I believe the
> > > > root cause might be related to the backward movement of the
> > > > confirmed_flush LSN. The following scenario seems possible:
> > > >
> > > > 1. The walsender enables the two_phase and sets two_phase_at
> > > > (which should be the same as confirmed_flush).
> > > > 2. The slot's confirmed_flush regresses for some reason.
> > > > 3. The slotsync worker retrieves the remote slot information and
> > > > enables two_phase for the local slot.
> > > >
> > >
> > > Yes, this is possible. Here is my theory as to how it can happen
> > > in the current case. In the failed test, after the primary has
> > > prepared a transaction, the transaction won't be replicated to the
> > > subscriber as two_phase was not enabled for the slot. However,
> > > subsequent keepalive messages can send the latest WAL location to
> > > the subscriber and get the confirmation of the same from the
> > > subscriber without its origin being moved. Now, after we restart
> > > the apply worker (due to disable/enable for a subscription), it
> > > will use the previous origin_lsn to temporarily move back the
> > > confirmed flush LSN as explained in one of the previous emails in
> > > another thread [1].
> > > During this temporary movement of confirm flush LSN, the slotsync
> > > worker fetches the two_phase_at and confirm_flush_lsn values,
> > > leading to the assertion failure. We see this issue intermittently
> > > because it depends on the timing of slotsync worker's request to
> > > fetch the slot's value.
> >
> > Based on this theory, I can reproduce the BF failure in the 040
> > tap-test on HEAD after applying the 0001 patch.
> > This is achieved by using the injection point to stop the walsender
> > from sending a keepalive before receiving the old origin position
> > from the apply worker, ensuring the confirmed_flush consistently
> > moves backward before slotsync.
> >
> > Additionally, I've reproduced the duplicate data issue on HEAD
> > without slotsync using the attached script (after applying the
> > injection point patch).
> > This issue arises if we immediately disable the subscription after
> > the confirm_flush_lsn moves backward, preventing the walsender from
> > advancing the confirm_flush_lsn.
> >
> > In this case, if a prepared transaction exists before two_phase_at,
> > then after re-enabling the subscription, it will replicate that
> > prepared transaction when decoding the PREPARE record and replicate
> > that again when decoding the COMMIT PREPARED record. In such cases,
> > the apply worker keeps reporting the error:
> >
> > ERROR: transaction identifier "pg_gid_16387_755" is already in use.
> >
> > Apart from the above, we're investigating whether the same issue can
> > occur in back-branches and will share the results once ready.
>
> I reproduced the duplicate data issue on PG17 as well using the
> attached shell script. Since PG17 doesn’t allow altering the two_phase
> option, I created a subscription with two_phase=on and copy_data=on. I
> prepared a transaction before the table synchronization was ready, at
> a time when the slot's two_phase hadn't been set to true. This setup
> can result in the prepared transaction being replicated twice after
> the apply worker restarts and the confirmed_flush_lsn moves backward.
>
> To ensure the origin position is initialized during table sync, I
> inserted some data before the prepared transaction. I added injection
> points (0001) to manage the table sync worker's process, allowing the
> apply worker to replicate some changes and update the origin position
> while table sync was ongoing.
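To make the sequence quoted above easier to follow, here is a toy Python model of the duplicate-PREPARE scenario. It is only a sketch under stated assumptions: the integer LSN values, the decode() and first_duplicate_prepare() helpers, and the simplified skip/replay rules are all hypothetical and written for illustration. It is not PostgreSQL code and not the attached script.

```python
# Toy model only: LSNs are plain integers and the rules below are a
# simplification of the behaviour described in this thread, not PostgreSQL code.

GID = "pg_gid_16387_755"  # gid taken from the error message quoted above

# (lsn, record kind, gid); the LSN values are made up for illustration.
WAL = [
    (100, "PREPARE", GID),
    (300, "COMMIT PREPARED", GID),
]

TWO_PHASE_AT = 200  # hypothetical LSN at which two_phase was enabled


def decode(wal, start_lsn, two_phase_at):
    """Return the messages sent when decoding `wal` from `start_lsn`
    on a slot that already has two_phase enabled."""
    prepare_lsn = {}
    sent = []
    for lsn, kind, gid in wal:
        if kind == "PREPARE":
            prepare_lsn[gid] = lsn
            if lsn >= start_lsn:
                # Decoding starts before the PREPARE record, so it is sent.
                sent.append(("PREPARE", gid))
        elif kind == "COMMIT PREPARED" and lsn >= start_lsn:
            if prepare_lsn[gid] < two_phase_at:
                # Prepared before two_phase_at: the decoder assumes the
                # PREPARE was never sent and replays the transaction.
                sent.append(("PREPARE", gid))
            sent.append(("COMMIT PREPARED", gid))
    return sent


def first_duplicate_prepare(messages):
    """Return the gid of the first PREPARE a subscriber would reject, if any."""
    prepared = set()
    for kind, gid in messages:
        if kind == "PREPARE":
            if gid in prepared:
                return gid
            prepared.add(gid)
        else:
            prepared.discard(gid)
    return None


# confirmed_flush equals two_phase_at: the PREPARE record is skipped.
normal = decode(WAL, start_lsn=TWO_PHASE_AT, two_phase_at=TWO_PHASE_AT)
# confirmed_flush has moved back to before the PREPARE record.
regressed = decode(WAL, start_lsn=50, two_phase_at=TWO_PHASE_AT)

print(first_duplicate_prepare(normal))     # None
print(first_duplicate_prepare(regressed))  # pg_gid_16387_755
```

In this model the only difference between the two runs is the starting LSN, which mirrors the point that the duplicate depends on confirmed_flush having regressed below the PREPARE record.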
The PG17 reproduction described above indicates that the issue has been present since at least PG15, when the two_phase subscription option was introduced. I am currently investigating whether the issue can also occur without the two_phase option. If it can, the fix will need to be applied to all supported branches. I will share the results once they are available.

Best Regards,
Hou zj