Hi, > 1. I've removed several calls to PgArchForceDirScan() in favor of > calling it at the top of pgarch_ArchiverCopyLoop(). I believe > there is some disagreement about this change, but I don't think > we gain enough to justify the complexity. The main reason we > exit pgarch_ArchiverCopyLoop() should ordinarily be that we've > run out of files to archive, so incurring a directory scan the > next time it is called doesn't seem like it would normally be too > bad. I'm sure there are exceptions (e.g., lots of .done files, > archive failures), but the patch is still not making things any > worse than they presently are for these cases.
Yes, I think when archiver is lagging behind then a call to force directory scan at the top of pgarch_ArchiverCopyLoop() does not have any impact. This may result into a directory scan in next cycle only when the archiver is ahead or in sync but in that case also a directory scan may not incur too much cost since the archiver is ahead.I agree that we can remove the separate calls to force a directory scan in failure scenarios with a single call at the top of PgArchForceDirScan(). > 2. I removed all the logic that attempted to catch out-of-order > .ready files. Instead, XLogArchiveNotify() only forces a > directory scan for files other than regular WAL files, and we > depend on our periodic directory scans to pick up anything that's > been left behind. > 3. I moved the logic that forces directory scans every once in a > while. We were doing that in the checkpoint/restartpoint logic, > which, upon further thought, might not be the best idea. The > checkpoint interval can vary widely, and IIRC we won't bother > creating checkpoints at all if database activity stops. Instead, > I've added logic in pgarch_readyXlog() that forces a directory > scan if one hasn't happened in a few minutes. > 4. Finally, I've tried to ensure comments are clear and that the > logic is generally easy to reason about. > > What do you think? I agree, If we force a periodic directory scan then we may not require any special logic for handling scenarios where a .ready file is created out of order in XLogArchiveNotify(). We need to force a directory scan only in case of a non-regular WAL file in XLogArchiveNotify(). Overall I think the periodic directory scan simplifies the patch and makes sure that any missing file gets archived within a few mins. Thanks, Dipesh