Hi, I took a brief look at the patch.

On the motivation side, I can see this being useful for synchronous replicas if you have commit set to flush mode. So +1 on the feature for easier configurability, although thinking about it more, you could probably have the restore script be smarter and return non-zero exit codes periodically.

The patch needs to be rebased, but I tested this against an older 17 build.

> +       ereport(DEBUG1,
> +               errmsg_internal("switched WAL source from %s to %s after %s",
> +                               xlogSourceNames[oldSource],

Not sure if you're intentionally changing to DEBUG1 from DEBUG2.

> * standby and increase the replication lag on primary.

Do you mean "increase replication lag on standby"?

nit: reading from archive *could* be faster, since in theory it's not limited to a single process/thread.

> However,
> + * exhaust all the WAL present in pg_wal before switching. If successful,
> + * the state machine moves to XLOG_FROM_STREAM state, otherwise it falls
> + * back to XLOG_FROM_ARCHIVE state.

I think I'm missing how this happens, or what "successful" means. If I'm reading it right, no matter what happens we will always move to XLOG_FROM_STREAM, based on how the state machine works?

I tested this in a basic RR setup without replication slots (e.g. log shipping) where the WAL is available in the archive but the primary always has the WAL rotated out, with 'streaming_replication_retry_interval = 1'. This leads the RR to become stuck: it stops fetching from archive and loops between XLOG_FROM_PG_WAL and XLOG_FROM_STREAM.

When 'streaming_replication_retry_interval' is breached, {currentSource, wal_source_switch_state} transitions from

  {XLOG_FROM_ARCHIVE, SWITCH_TO_STREAMING_NONE} -> {XLOG_FROM_ARCHIVE, SWITCH_TO_STREAMING_PENDING}

with readFrom = XLOG_FROM_PG_WAL. That reads the last record in pg_wal successfully and then fails to read the next one because it doesn't exist, transitioning to {XLOG_FROM_STREAM, SWITCH_TO_STREAMING_PENDING}.
XLOG_FROM_STREAM fails because the WAL is no longer there on the primary, so it sets the state back to {XLOG_FROM_ARCHIVE, SWITCH_TO_STREAMING_PENDING}.

> last_fail_time = now;
> currentSource = XLOG_FROM_ARCHIVE;
> break;

Since the state is still SWITCH_TO_STREAMING_PENDING from the previous loops, it forces

> Assert(currentSource == XLOG_FROM_ARCHIVE);
> readFrom = XLOG_FROM_PG_WAL;
> ...
> readFile = XLogFileReadAnyTLI(readSegNo, DEBUG2, readFrom);

And this readFile call seems to always succeed, since it can read the latest WAL record but not the next one, which is in the archive. That leads to a transition back to XLOG_FROM_STREAM, and the cycle repeats.

> /*
>  * Nope, not found in archive or pg_wal.
>  */
> lastSourceFailed = true;

I don't think this gets triggered for the XLOG_FROM_PG_WAL case, which means the safety check you added doesn't actually kick in.

> if (wal_source_switch_state == SWITCH_TO_STREAMING_PENDING)
> {
>     wal_source_switch_state = SWITCH_TO_STREAMING;
>     elog(LOG, "SWITCH_TO_STREAMING_PENDING TO SWITCH_TO_STREAMING");
> }

Thanks

--
John Hsu - Amazon Web Services
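
To make the loop described above concrete, here is a toy C model of it. This is only a sketch of the state machine as described in this mail, not the patch code: the enum names mirror the patch, but simulate_stuck_standby(), SimResult, and the reset_on_stream_failure knob are hypothetical names invented for illustration, and the retry timer itself is not modeled. It assumes the primary has already recycled the needed WAL and that pg_wal holds only records that were already read.

```c
#include <assert.h>
#include <stdbool.h>

/* Names mirror the patch; the logic below is a simplification of it. */
typedef enum { XLOG_FROM_ARCHIVE, XLOG_FROM_PG_WAL, XLOG_FROM_STREAM } XLogSource;
typedef enum {
    SWITCH_TO_STREAMING_NONE,
    SWITCH_TO_STREAMING_PENDING,
    SWITCH_TO_STREAMING          /* listed for completeness, not exercised here */
} WalSourceSwitchState;

/* Hypothetical result type: what the standby did during the simulation. */
typedef struct {
    int archive_fetches;   /* passes that actually restored from archive */
    int stream_attempts;   /* failed attempts to stream from the primary */
} SimResult;

/*
 * Run `iterations` passes of the simplified loop, starting right after the
 * retry interval was breached (state PENDING, source ARCHIVE).
 * reset_on_stream_failure is a hypothetical knob, not something in the patch:
 * it shows what changes if the PENDING state is cleared when streaming fails.
 */
SimResult simulate_stuck_standby(int iterations, bool reset_on_stream_failure)
{
    XLogSource currentSource = XLOG_FROM_ARCHIVE;
    WalSourceSwitchState state = SWITCH_TO_STREAMING_PENDING;
    SimResult result = {0, 0};

    for (int i = 0; i < iterations; i++) {
        if (currentSource == XLOG_FROM_STREAM) {
            /* Streaming fails: the WAL is gone on the primary. */
            result.stream_attempts++;
            currentSource = XLOG_FROM_ARCHIVE;
            if (reset_on_stream_failure)
                state = SWITCH_TO_STREAMING_NONE;
            continue;
        }

        assert(currentSource == XLOG_FROM_ARCHIVE);
        if (state == SWITCH_TO_STREAMING_PENDING) {
            /*
             * Forced read from pg_wal: the last record is readable, the next
             * one is not, so lastSourceFailed is never set and we move on to
             * streaming without touching the archive.
             */
            currentSource = XLOG_FROM_STREAM;
        } else {
            /* Normal path: restore the next segment from archive. */
            result.archive_fetches++;
        }
    }
    return result;
}
```

Under these assumptions, simulate_stuck_standby(10, false) ends with 0 archive fetches and 5 failed stream attempts, which matches the stuck behaviour observed in the test. With the hypothetical reset enabled, simulate_stuck_standby(10, true) makes 1 stream attempt and then 8 archive fetches.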