On Sat, Apr 16, 2011 at 7:24 AM, Robert Haas <robertmh...@gmail.com> wrote:
> The OP says that this patch maintains the WAL-before-data rule without any
> explanation of how it accomplishes that seemingly quite amazing feat. I
> assume I'm going to have to read this patch at some point to refute this
> assertion, and I think that sucks. I am pretty nearly 100% confident that
> this approach is utterly doomed, and I don't want to spend a lot of time on
> it unless someone can provide me with a compelling explanation of why my
> confidence is misplaced.

FWIW, he did explain how he did that. Or at least I think he did -- it's
possible I read what I expected, because what he came up with is something
I've recently been thinking about.

What he did, I gather, is treat the mmapped buffers as a read-only copy of
the data. To actually make any modifications he copies the page into shared
buffers and treats it like a normal buffer. When the buffer gets flushed from
memory it gets written out, and then the pointer gets repointed back at the
mmapped copy.

Effectively this means shared buffers get extended to include all of the
filesystem cache, instead of having to evict buffers from shared buffers just
because you want to read another one that's already in the filesystem cache.

It doesn't save the copying between filesystem cache and shared buffers for
buffers that are actually being written to. But it does save some amount of
other copying on read-only traffic, and it can even save some I/O.

It does require a function call before each buffer modification. The pattern
is currently <lock buffer>, <mutate buffer>, <mark buffer dirty>. From what he
describes, he needs to add a <prepare buffer for mutation> between the lock
and the mutate.

I think it's an interesting experiment, and it's good to know how to solve
some of the subproblems. Notably, how do you extend files or drop them
atomically across processes? And how do you get the mappings to be the same
across all the processes, or deal with them being different?

But I don't think it's a great long-term direction. It just seems clunky to
have to copy things from mmapped buffers to local buffers and back. Perhaps
performance testing will show that the clunkiness is well worth it, but we'll
need to see that for a wide variety of workloads to judge.

-- 
greg
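
For illustration only, here is a minimal C sketch of the <lock buffer>,
<prepare buffer for mutation>, <mutate buffer>, <mark buffer dirty> sequence
described above. Every name in it (SketchBuffer, sketch_prepare_for_mutation,
and so on) is hypothetical; none of it is taken from the patch or from
PostgreSQL's buffer manager. A plain memory block stands in for the mmapped
file page, and WAL handling is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 8192

/* Hypothetical descriptor; this is NOT PostgreSQL's BufferDesc. */
typedef struct SketchBuffer
{
    const char *mapped;  /* read-only view; stands in for an mmapped page */
    char       *local;   /* private writable copy; NULL until first change */
    bool        locked;
    bool        dirty;
} SketchBuffer;

/* Readers see the private copy if one exists, else the mapped page. */
static const char *
sketch_read(const SketchBuffer *buf)
{
    return buf->local != NULL ? buf->local : buf->mapped;
}

static void
sketch_lock(SketchBuffer *buf)
{
    assert(!buf->locked);
    buf->locked = true;
}

static void
sketch_unlock(SketchBuffer *buf)
{
    assert(buf->locked);
    buf->locked = false;
}

/*
 * The extra step: before the first modification, copy the page out of the
 * read-only mapping into writable memory and return that copy.
 */
static char *
sketch_prepare_for_mutation(SketchBuffer *buf)
{
    assert(buf->locked);
    if (buf->local == NULL)
    {
        buf->local = malloc(SKETCH_PAGE_SIZE);
        if (buf->local == NULL)
            abort();
        memcpy(buf->local, buf->mapped, SKETCH_PAGE_SIZE);
    }
    return buf->local;
}

static void
sketch_mark_dirty(SketchBuffer *buf)
{
    assert(buf->locked);
    buf->dirty = true;
}

/*
 * Flush: write the private copy to the backing storage if it is dirty, then
 * drop it so that readers fall back to the mapped page. A real system would
 * have to flush the relevant WAL before this write; that is omitted here.
 */
static void
sketch_flush(SketchBuffer *buf, char *backing)
{
    if (buf->local != NULL)
    {
        if (buf->dirty)
            memcpy(backing, buf->local, SKETCH_PAGE_SIZE);
        free(buf->local);
        buf->local = NULL;
    }
    buf->dirty = false;
}
```

In this sketch the backing storage is untouched until sketch_flush() runs,
so a change made after sketch_prepare_for_mutation() is visible through
sketch_read() but not yet in the "file" page. That is the property that would
leave room for WAL to be written first.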