Dear Amit, Alexander,

> > Regarding assertion failure, I've found that assert in
> > PhysicalConfirmReceivedLocation() conflicts with restart_lsn
> > previously set by ReplicationSlotReserveWal(). As I can see,
> > ReplicationSlotReserveWal() just picks fresh XLogCtl->RedoRecPtr lsn.
> > So, it doesn't seems there is a guarantee that restart_lsn never goes
> > backward. The commit in ReplicationSlotReserveWal() even states there
> > is a "chance that we have to retry".
> >
>
> I don't see how this theory can lead to a restart_lsn of a slot going
> backwards. The retry mentioned there is just a retry to reserve the
> slot's position again if the required WAL is already removed. Such a
> retry can only get the position later than the previous restart_lsn.
We analyzed the assertion failure that happened in pg_basebackup/020_pg_receivewal [1], and confirmed that restart_lsn can go backward. This means that the Assert() added by commit ca307d5 can cause a crash.

Background
==========
When pg_receivewal starts replication using a replication slot, it sets the startpoint to the beginning of the segment that contains restart_lsn. E.g., if the restart_lsn of the slot is 0/B000D0, pg_receivewal requests WAL starting from 0/B00000. For more detail on this behavior, see commits f61e1dd2 and d9bae531.

What happened here
==================
Because of the behavior above, the walsender started sending WAL from the beginning of the segment (0/B00000). When the walreceiver received it, it sent a feedback reply, and at that point the flushed WAL location it reported was 0/B00000. The walsender then set the reported LSN as restart_lsn in PhysicalConfirmReceivedLocation(), so restart_lsn went backward (0/B000D0 -> 0/B00000).

The assertion failure can happen if a CHECKPOINT runs at that moment: the slot's last_saved_restart_lsn was 0/B000D0 while data.restart_lsn was 0/B00000, which does not satisfy the assertion added in InvalidatePossiblyObsoleteSlot(). (A minimal sketch of the segment rounding and the violated condition is appended below my signature.)

Note
====
1. In this case, starting from the beginning of the segment is not a problem, because the checkpoint process only removes WAL files from segments that precede the segment containing restart_lsn. The current segment (0/B00000) will not be removed, so there is no risk of data loss or inconsistency.
2. A similar pattern applies to pg_basebackup. Both adjust the requested streaming position to the start of the segment, and both report the received LSN as the flushed position.
3. I applied the same theory to 040_standby_failover_slots_sync, but I could not reproduce that failure because it is a timing issue. Has anyone else been able to reproduce it? We are still investigating the failure seen in 040_standby_failover_slots_sync.

[1]: https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=scorpion&dt=2025-06-17%2000%3A40%3A46&stg=pg_basebackup-check

Best regards,
Hayato Kuroda
FUJITSU LIMITED
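
PS: Below is a minimal, self-contained C sketch (not the actual PostgreSQL code) of the arithmetic described above. It assumes a 1 MB wal_segment_size, which is what the LSNs in this mail imply; the names WAL_SEG_SIZE, segment_start(), and the local variables are only illustrative.

#include <stdint.h>
#include <stdio.h>

typedef uint64_t XLogRecPtr;

/* Assumption: 1 MB WAL segments, as the LSNs in this mail imply. */
#define WAL_SEG_SIZE ((uint64_t) 1024 * 1024)

/* Round an LSN down to the start of its WAL segment. */
static XLogRecPtr
segment_start(XLogRecPtr lsn)
{
    return lsn - (lsn % WAL_SEG_SIZE);
}

int
main(void)
{
    XLogRecPtr  restart_lsn = 0xB000D0;                  /* 0/B000D0 */
    XLogRecPtr  startpoint = segment_start(restart_lsn); /* 0/B00000 */

    /* pg_receivewal asks to stream from the segment boundary ... */
    printf("restart_lsn = 0/%X, startpoint = 0/%X\n",
           (unsigned int) restart_lsn, (unsigned int) startpoint);

    /*
     * ... and reports that boundary back as its flushed position, so the
     * slot's restart_lsn moves backward (0/B000D0 -> 0/B00000).  If a
     * checkpoint has already saved last_saved_restart_lsn = 0/B000D0,
     * a condition like the one checked below no longer holds.
     */
    XLogRecPtr  last_saved_restart_lsn = restart_lsn;   /* saved by checkpoint */
    XLogRecPtr  new_restart_lsn = startpoint;           /* set by walsender */

    if (last_saved_restart_lsn > new_restart_lsn)
        printf("last_saved_restart_lsn (0/%X) > restart_lsn (0/%X): "
               "the assertion condition is violated\n",
               (unsigned int) last_saved_restart_lsn,
               (unsigned int) new_restart_lsn);

    return 0;
}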