On Thu, May 3, 2012 at 1:27 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
> Why not switch to 1 WAL record per file, rather than 1 per page. (32
> pages, IIRC).
>
> We can then have the whole new file written as zeroes by a background
> process, which needn't do that while holding the XidGenLock.

I thought about doing a single record covering a larger number of pages, but that would be an even bigger hit if it were ever to occur in the foreground path, so you'd want to be very sure that the background process was going to absorb all the work. And if the background process is going to absorb all the work, then I'm not sure it matters very much whether we emit one xlog record or 32. After all, it's pretty low volume compared to all the other xlog traffic. Maybe there's some room for optimization here, but it doesn't seem like the first thing to pursue.

Doing it in a background process, though, may make sense. What I'm a little worried about is that, on a busy system, we've only got about 2 seconds to complete each CLOG extension, and we must do an fsync in order to get there. And the fsync can easily take a good chunk of (or even more than) those two seconds. So it's possible that saddling the bgwriter with this responsibility would be putting too many eggs in one basket. We might find that under the high-load scenarios where this is supposed to help, bgwriter is already too busy doing other things, and it doesn't get around to extending CLOG quickly enough. Or, conversely, we might find that it does get around to extending CLOG quickly enough, but consequently fails to carry out its regular duties. We could of course add a NEW background process just for this purpose, but it'd be nicer if we didn't have to go that far.

> My earlier patch to do background flushing from bgwriter can be
> extended to do that.

I've just been looking at that patch again, since, as we discussed before, commit 3ae5133b1cf478d516666f2003bc68ba0edb84c7 fixed a problem in this area, and it may be that we can now show a benefit of this approach where we couldn't before. I think it's separate from what we're discussing here, so let me write more about that on another thread after I poke at it a little more.
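
For a rough sense of where a figure like "about 2 seconds" per CLOG extension can come from, here is a back-of-envelope sketch. It assumes the default 8 kB block size and the usual CLOG layout of 2 status bits per transaction, with 32 pages per segment file as mentioned in the quoted text. The transaction rate is purely hypothetical: the mail does not state one, and 16384 XIDs per second is chosen only because it makes the arithmetic land on 2 seconds.

```python
# Back-of-envelope arithmetic for how often CLOG must be extended.
# Assumptions (not stated in the mail): default block size, and a
# hypothetical XID consumption rate picked to match the "2 seconds" figure.

BLCKSZ = 8192                  # default PostgreSQL block size, in bytes
CLOG_BITS_PER_XACT = 2         # two commit-status bits per transaction
CLOG_XACTS_PER_BYTE = 8 // CLOG_BITS_PER_XACT
CLOG_XACTS_PER_PAGE = BLCKSZ * CLOG_XACTS_PER_BYTE
SLRU_PAGES_PER_SEGMENT = 32    # pages per CLOG segment file

ASSUMED_XIDS_PER_SECOND = 16384  # hypothetical busy-system rate


def seconds_per_page(xids_per_second: float) -> float:
    """Time between successive one-page CLOG extensions."""
    return CLOG_XACTS_PER_PAGE / xids_per_second


def seconds_per_segment(xids_per_second: float) -> float:
    """Time between extensions if a whole segment file is done at once."""
    return SLRU_PAGES_PER_SEGMENT * seconds_per_page(xids_per_second)


if __name__ == "__main__":
    print(CLOG_XACTS_PER_PAGE)                           # 32768
    print(seconds_per_page(ASSUMED_XIDS_PER_SECOND))     # 2.0
    print(seconds_per_segment(ASSUMED_XIDS_PER_SECOND))  # 64.0
```

Under these assumptions, a per-page scheme leaves roughly 2 seconds per extension (fsync included), while a per-file scheme would come around about once a minute but with 32 times as much zeroing to do each time.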

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company