Hi,

On 2021-05-03 16:49:16 -0400, Robert Haas wrote:
> I have two possible ideas for addressing this; perhaps other people
> will have further suggestions. A relatively non-invasive fix would be
> to teach pgarch.c how to increment a WAL file name. After archiving
> segment N, check using stat() whether there's an .ready file for
> segment N+1. If so, do that one next. If not, then fall back to
> performing a full directory scan.

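For reference, the "increment a WAL file name" step quoted above could look
roughly like the sketch below. It is only an illustration under stated
assumptions: next_wal_segment_name() and ready_file_exists() are hypothetical
helpers (not existing pgarch.c functions), the segment name is assumed to be
the usual 24 upper-case hex digits (timeline, log, segment), and the WAL
segment size is passed in by the caller.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

#define WAL_FNAME_LEN 24		/* 8 hex digits each: timeline, log, segment */

/*
 * Hypothetical helper: compute the name of the segment following "cur" on
 * the same timeline. Returns 0 on success, -1 if "cur" is not a plain WAL
 * segment name or the output buffer is too small. Wraparound of the "log"
 * field past 0xFFFFFFFF is not handled in this sketch.
 */
static int
next_wal_segment_name(const char *cur, uint64_t wal_segsz_bytes,
					  char *next, size_t nextlen)
{
	unsigned int tli, log, seg;
	uint64_t	segs_per_id;
	uint64_t	segno;

	if (wal_segsz_bytes == 0 || nextlen <= WAL_FNAME_LEN)
		return -1;
	if (strlen(cur) != WAL_FNAME_LEN ||
		strspn(cur, "0123456789ABCDEF") != WAL_FNAME_LEN)
		return -1;
	if (sscanf(cur, "%8X%8X%8X", &tli, &log, &seg) != 3)
		return -1;

	/* Number of segments that share one "log" value, e.g. 256 for 16MB. */
	segs_per_id = UINT64_C(0x100000000) / wal_segsz_bytes;

	/* Flatten (log, seg) into one counter, step by one, split again. */
	segno = (uint64_t) log * segs_per_id + seg + 1;
	snprintf(next, nextlen, "%08X%08X%08X", tli,
			 (unsigned int) (segno / segs_per_id),
			 (unsigned int) (segno % segs_per_id));
	return 0;
}

/*
 * Hypothetical helper: a single stat() to see whether "<segname>.ready"
 * exists in the given archive status directory. Returns 1 if it does.
 */
static int
ready_file_exists(const char *status_dir, const char *segname)
{
	char		path[1024];
	struct stat st;

	snprintf(path, sizeof(path), "%s/%s.ready", status_dir, segname);
	return stat(path, &st) == 0;
}
```

The caller would try ready_file_exists() for the computed name first and fall
back to the full directory scan whenever it returns 0, so gaps and timeline
switches would behave as they do today.
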
Hm. I wonder if it'd not be better to determine multiple files to be
archived in one readdir() pass?

> As far as I can see, this is just cheap insurance. If archiving is
> keeping up, the extra stat() won't matter much. If it's not, this will
> save more system calls than it costs. Since during normal operation it
> shouldn't really be possible for files to show up in pg_wal out of
> order, I don't really see a scenario where this changes the behavior,
> either. If there are gaps in the sequence at startup time, this will
> cope with it exactly the same as we do now, except with a better
> chance of finishing before I retire.

There are definitely gaps in practice :(. Due to the massive performance
issues with archiving there are several tools that archive multiple files
as part of one archive command invocation (and mark the additional
archived files as .done immediately).

> However, that's still pretty wasteful. Every time we have to wait for
> the next file to be ready for archiving, we'll basically fall back to
> repeatedly scanning the whole directory, waiting for it to show up.

Hm. That seems like it's only an issue because .done and .ready are in
the same directory? Otherwise the directory would be empty while we're
waiting for the next file to be ready to be archived.

I hate that that's a thing but given the serial nature of archiving, with
high per-call overhead, I don't think it'd be ok to just break that
without a replacement :(.

> But perhaps we could work around this by allowing pgarch.c to access
> shared memory, in which case it could examine the current timeline
> whenever it wants, and probably also whatever LSNs it needs to know
> what's safe to archive.

FWIW, the shared memory stats patch implies doing that, since the
archiver reports stats.

> If we did that, could we just get rid of the .ready and .done files
> altogether?
> Are they just a really expensive IPC mechanism to avoid a
> shared memory connection, or is there some more fundamental reason why
> we need them?

What kind of shared memory mechanism are you thinking of? Due to
timelines and history files I don't think simple position counters would
be quite enough.

I think the aforementioned "batching" archive commands are part of the
problem :(.

> And is there any good reason why the archiver shouldn't be connected
> to shared memory? It is certainly nice to avoid having more processes
> connected to shared memory than necessary, but the current scheme is
> so inefficient that I think we end up worse off.

I think there is no fundamental reason for avoiding shared memory in the
archiver. I guess there's a minor robustness advantage, because the
forked shell to start the archive command won't be attached to shared
memory. But that's only until the child exec()s to the archive command.

There is some minor performance advantage as well: not having to process
the often large and contended memory mapping for shared_buffers is
probably measurable - but swamped by the cost of needing to actually
archive the segment.

My only "concern" with doing anything around this is that I think the
whole approach of archive_command is just hopelessly broken, with even
just halfway busy servers only able to keep up archiving if they muck
around with postgres internal data during archive command
execution. Add to that how hard it is to write a robust archive command
(e.g. the one in our docs still suggests test ! -f && cp, which means
that copy failing in the middle yields an incomplete archive)...

While I don't think it's all that hard to design a replacement, it's
however likely still more work than addressing the O(n^2) issue, so ...

Greetings,

Andres Freund