On Mon, Feb 19, 2024 at 12:14 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> > Does the primary error message really need to say "could not sync
> > slot"? If it will be obvious from context that we were trying to sync
> > a slot, then it would be fine to just say "ERROR: remote slot precedes
> > local slot".
>
> As this is a LOG message, I feel one may need some more information on
> the context but it is not mandatory.

I'll defer to you.

> > But I also don't quite understand what problem this is trying to
> > report. Is this slot-syncing code running on the primary or the
> > standby? If it's running on the primary, then surely it's expected
> > that the remote slot will precede the local one. And if it's running
> > on the standby, then the comments in
> > update_and_persist_local_synced_slot about waiting for the remote side
> > to catch up seem quite confusing, because surely we're chasing the
> > primary and not the other way around?
>
> The local's restart_lsn could be ahead of the primary's for the very
> first sync when the WAL corresponding to the remote's restart_lsn is
> not available on standby (say due to different WAL-related settings
> the required WAL has been removed when we first tried to sync the
> slot). For more details, you can refer to comments atop slotsync.c
> starting from "If the WAL corresponding to the remote's restart_lsn
> ..."

So why do we log a message about this?

> > But if we ignore all of that, then we could just do this:
> >
> > ERROR: could not sync slot information as remote slot precedes local slot
> > DETAIL: Remote slot has LSN %X/%X and catalog xmin %u, but local slot
> > has LSN %X/%X and catalog xmin %u.
> >
>
> This looks good to me but instead of ERROR here we want to use LOG.

Fair enough!

-- 
Robert Haas
EDB: http://www.enterprisedb.com
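
For illustration only, the condition and message discussed in this thread could be sketched roughly as follows. This is a standalone Python sketch, not the slotsync.c code: SlotPosition, format_lsn, remote_precedes_local and sync_log_message are hypothetical names, the choice to compare both restart_lsn and catalog xmin is an assumption based on the two values shown in the DETAIL line, and the xmin comparison is a plain integer comparison that ignores transaction ID wraparound. The DETAIL text names the local slot in its second clause, which is presumably what was intended.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SlotPosition:
    """Hypothetical stand-in for the slot fields compared during sync."""
    restart_lsn: int   # 64-bit WAL position
    catalog_xmin: int  # transaction ID


def format_lsn(lsn: int) -> str:
    """Render a 64-bit LSN as high/low 32-bit halves in hex, like %X/%X."""
    return f"{lsn >> 32:X}/{lsn & 0xFFFFFFFF:X}"


def remote_precedes_local(remote: SlotPosition, local: SlotPosition) -> bool:
    # Assumption: the remote slot "precedes" the local one if either value
    # is behind. Plain integer comparison; wraparound is not handled here.
    return (remote.restart_lsn < local.restart_lsn
            or remote.catalog_xmin < local.catalog_xmin)


def sync_log_message(remote: SlotPosition, local: SlotPosition) -> Optional[str]:
    """Return the LOG text for the 'remote precedes local' case, else None."""
    if not remote_precedes_local(remote, local):
        return None
    return (
        "LOG: could not sync slot information as remote slot precedes local slot\n"
        f"DETAIL: Remote slot has LSN {format_lsn(remote.restart_lsn)} "
        f"and catalog xmin {remote.catalog_xmin}, "
        f"but local slot has LSN {format_lsn(local.restart_lsn)} "
        f"and catalog xmin {local.catalog_xmin}."
    )


if __name__ == "__main__":
    # First sync on a standby whose older WAL has already been removed, so
    # the local slot starts ahead of the remote one (made-up values).
    remote = SlotPosition(restart_lsn=0x0000000103000028, catalog_xmin=740)
    local = SlotPosition(restart_lsn=0x0000000105000060, catalog_xmin=745)
    print(sync_log_message(remote, local))
```

The function returns a message instead of raising, which mirrors the point made above that this is reported at LOG level rather than as an ERROR: the sync attempt is skipped for now and can be retried once the remote slot has caught up.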