On Tue, Aug 24, 2021 at 1:26 PM Bossart, Nathan <bossa...@amazon.com> wrote:
> I think Horiguchi-san made a good point that the .ready file creators
> should ideally not need to understand archiving details.  However, I
> think this approach requires them to be inextricably linked.  In the
> happy case, the archiver will follow the simple path of processing
> each consecutive WAL file without incurring a directory scan.  Any
> time there is something other than a regular WAL file to archive, we
> need to take special action to make sure it is picked up.
I think they should be inextricably linked, really. If we know something - like that there's a file ready to be archived - then it seems like we should not throw that information away and force somebody else to rediscover it through an expensive process. The whole problem here comes from the fact that we're using the filesystem as an IPC mechanism, and it's sometimes a very inefficient one.

I can't quite decide whether the problems we're worrying about here are real issues or just kind of hypothetical. I mean, today, it seems to be possible that we fail to mark some file ready for archiving, emit a log message, and then a huge amount of time could go by before we try again to mark it ready for archiving. Are the problems we're talking about here objectively worse than that, or just different? Is it a problem in practice, or just in theory?

I really want to avoid getting backed into a corner where we decide that the status quo is the best we can do, because I'm pretty sure that has to be the wrong conclusion. If we think that the get-a-bunch-of-files-per-readdir approach is better than the keep-trying-the-next-file approach, I mean that's OK with me; I just want to do something about this. I am not sure whether or not that's the right course of action.
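To make the comparison concrete, here's the sort of thing I have in mind for the batching approach. This is only an illustrative sketch using plain POSIX calls, not the actual pgarch.c code; the function name, batch size, and status-directory argument are invented, and it ignores the special cases Nathan mentions above (history files and so on).

#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BATCH_SIZE   64
#define MAX_FNAME    256
#define READY_SUFFIX ".ready"

static int
cmp_names(const void *a, const void *b)
{
    return strcmp((const char *) a, (const char *) b);
}

/*
 * Scan the status directory once and remember up to maxfiles ".ready"
 * entries, so the caller can archive a whole batch before rescanning.
 * Returns the number of entries collected, or -1 on error.
 */
static int
collect_ready_batch(const char *statusdir, char batch[][MAX_FNAME],
                    int maxfiles)
{
    DIR           *dir = opendir(statusdir);
    struct dirent *de;
    int            n = 0;
    size_t         suflen = strlen(READY_SUFFIX);

    if (dir == NULL)
        return -1;

    while (n < maxfiles && (de = readdir(dir)) != NULL)
    {
        size_t len = strlen(de->d_name);

        /* Skip anything that is not a ".ready" status file. */
        if (len <= suflen ||
            strcmp(de->d_name + len - suflen, READY_SUFFIX) != 0)
            continue;

        /* Remember the file name with the ".ready" suffix stripped. */
        snprintf(batch[n], MAX_FNAME, "%.*s",
                 (int) (len - suflen), de->d_name);
        n++;
    }
    closedir(dir);

    /* Archive oldest-first; plain name order is close enough here. */
    qsort(batch, n, MAX_FNAME, cmp_names);
    return n;
}

int
main(int argc, char **argv)
{
    char batch[BATCH_SIZE][MAX_FNAME];
    int  n = collect_ready_batch(argc > 1 ? argv[1] : ".",
                                 batch, BATCH_SIZE);

    for (int i = 0; i < n; i++)
        printf("would archive: %s\n", batch[i]);
    return n < 0 ? 1 : 0;
}

The point is just that a single readdir() pass can feed many archive cycles, so the cost of rediscovering state from the filesystem gets amortized; the keep-trying-the-next-file approach avoids the scan entirely, but then it has to be told explicitly about anything that isn't the next expected segment.

--
Robert Haas
EDB: http://www.enterprisedb.com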