Hi,

On 2019-01-23 18:45:42 +0200, Heikki Linnakangas wrote:
> To re-iterate what I said earlier in this thread, I think the next step here
> is to write a patch that modifies xlog.c to use plain old mmap()/msync() to
> memory-map the WAL files, to replace the WAL buffers. Let's see what the
> performance of that is, with or without NVM hardware. I think that might
> actually make the code simpler. There's a bunch of really hairy code around
> locking the WAL buffers, which could be made simpler if each backend
> memory-mapped the WAL segment files independently.
>
> One thing to watch out for, is that if you read() a file, and there's an I/O
> error, you have a chance to ereport() it. If you try to read from a
> memory-mapped file, and there's an I/O error, the process is killed with
> SIGBUS. So I think we have to be careful with using memory-mapped I/O for
> reading files. But for writing WAL files, it seems like a good fit.
>
> Once we have a reliable mmap()/msync() implementation running, it should be
> straightforward to change it to use MAP_SYNC and the special CPU
> instructions for the flushing.
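
For concreteness, the plain mmap()/msync() write path proposed above might look roughly like the following standalone C sketch. It is not PostgreSQL code: the function name demo_mmap_wal_write, the DEMO_SEGMENT_SIZE constant, and the choice to flush the whole mapping are assumptions made purely for illustration, and it ignores locking, partial flushes, and MAP_SYNC entirely.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical segment size for the demo (real WAL segments default to 16MB). */
#define DEMO_SEGMENT_SIZE (16 * 1024)

/*
 * Hypothetical helper: copy 'len' bytes of 'rec' to 'offset' in the file at
 * 'path' through a shared mapping, then flush with msync().
 * Returns 0 on success, -1 on failure.
 */
int
demo_mmap_wal_write(const char *path, size_t offset, const void *rec, size_t len)
{
    int     fd;
    char   *seg;
    int     rc = -1;

    /* Refuse writes that would run past the end of the segment. */
    if (offset > DEMO_SEGMENT_SIZE || len > DEMO_SEGMENT_SIZE - offset)
        return -1;

    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    /* Size the file first: storing beyond EOF in a mapping raises SIGBUS. */
    if (ftruncate(fd, DEMO_SEGMENT_SIZE) != 0)
    {
        close(fd);
        return -1;
    }

    seg = mmap(NULL, DEMO_SEGMENT_SIZE, PROT_READ | PROT_WRITE,
               MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping remains valid after close() */
    if (seg == MAP_FAILED)
        return -1;

    /* "Writing WAL" is now just a memory copy into the mapped segment. */
    memcpy(seg + offset, rec, len);

    /*
     * msync() needs a page-aligned start address, so for simplicity this
     * flushes the whole mapping; a real implementation would flush only
     * the dirty range.
     */
    if (msync(seg, DEMO_SEGMENT_SIZE, MS_SYNC) == 0)
        rc = 0;

    munmap(seg, DEMO_SEGMENT_SIZE);
    return rc;
}
```

A caller would use it as demo_mmap_wal_write("/tmp/demo_wal_segment", 128, rec, reclen). Note that an I/O error hit while storing through the mapping shows up as SIGBUS rather than as an error return, which is the same hazard the quoted text describes for reads.
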

FWIW, I don't think we should go there as the sole implementation. I'm
fairly convinced that we're going to need to go to direct-IO in more
cases here, and that'll not work well with mmap. I think this'd be a
worthwhile experiment, but I'm doubtful it'd end up simplifying our
code.

Greetings,

Andres Freund