On Tue, Apr 10, 2018 at 1:44 PM, Craig Ringer <cr...@2ndquadrant.com> wrote:
> On 10 April 2018 at 03:59, Andres Freund <and...@anarazel.de> wrote:
>> I don't think that's as hard as some people argued in this thread. We
>> could very well open a pipe in postmaster with the write end open in
>> each subprocess, and the read end open only in checkpointer (and
>> postmaster, but unused there). Whenever closing a file descriptor that
>> was dirtied in the current process, send it over the pipe to the
>> checkpointer. The checkpointer then can receive all those file
>> descriptors (making sure it's not above the limit, fsync(), close()ing
>> to make room if necessary). The biggest complication would presumably
>> be to deduplicate the received file descriptors for the same file,
>> without losing track of any errors.
>
> Yep. That'd be a cheaper way to do it, though it wouldn't work on
> Windows. Though we don't know how Windows behaves here at all yet.
>
> Prior discussion upthread had the checkpointer open()ing a file at the
> same time as a backend, before the backend writes to it. But passing
> the fd when the backend is done with it would be better.

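For concreteness, here is a minimal sketch of the fd hand-off Andres
describes. It rests on an assumption the thread doesn't spell out: on most
Unix systems a plain pipe() cannot carry descriptors, so the sketch uses a
Unix-domain socket with SCM_RIGHTS instead. The helper names send_fd() and
recv_fd() are hypothetical, not existing PostgreSQL functions, and none of
the deduplication or error tracking is shown.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: pass one open file descriptor to the peer of a
 * Unix-domain socket.  Returns 0 on success, -1 on failure.
 */
static int
send_fd(int sock, int fd)
{
    char            payload = 'F';  /* one data byte carries the control message */
    struct iovec    iov = {.iov_base = &payload, .iov_len = 1};
    struct msghdr   msg;
    struct cmsghdr *cmsg;
    union
    {
        char            buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr  align;      /* forces correct alignment of buf */
    }               u;

    memset(&msg, 0, sizeof(msg));
    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/*
 * Hypothetical helper: receive one descriptor sent by send_fd().
 * Returns the new descriptor (owned by the caller), or -1 on failure.
 */
static int
recv_fd(int sock)
{
    char            payload;
    struct iovec    iov = {.iov_base = &payload, .iov_len = 1};
    struct msghdr   msg;
    struct cmsghdr *cmsg;
    int             fd;
    union
    {
        char            buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr  align;
    }               u;

    memset(&msg, 0, sizeof(msg));
    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL ||
        cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

In this scheme a backend would call send_fd() instead of close()ing a
dirtied file straight away, and the checkpointer would fsync() and close()
whatever recv_fd() hands it. The received descriptor refers to the same
open file description as the sender's, which is presumably what matters
for seeing that file's writeback errors.
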
How would that interlock with concurrent checkpoints? I can see how to
make that work if the share-fd-or-fsync-now logic happens in smgrwrite()
when called by FlushBuffer() while you hold io_in_progress, but not if
you defer it to some random time later.

-- 
Thomas Munro
http://www.enterprisedb.com