Tom, first of all, excellent job improving the current algorithm. I'm glad you looked at the WALCommitLock code.
> This must be so because the backends that are
> released at the end of any given disk revolution will not be able to
> participate in the next group commit, if there is already at least
> one backend ready to commit.

This is the major reason for my original suggestion about using aio_write. The writes don't block each other, and there is no need for a kernel-level exclusive locking call like fsync or fdatasync.

Even the theoretical limit you mention, one transaction per revolution per committing process, seems like a significant bottleneck.

Is committing alternately 1 and 4 transactions per revolution good? It's certainly better than 1 per revolution.

However, what if we could have done 3 transactions per process in the time it took for a single revolution?

Then we are looking at (1 + 4) / 2 = 2.5 transactions per revolution versus the theoretical maximum of (3 * 5) = 15 transactions per revolution, if we can figure out a way to do non-blocking writes that we can guarantee are on the disk platter so we can return from commit.

Setting aside whether or not aio is viable, do you not agree that eliminating the blocking could yield up to a 6X improvement (15 / 2.5) for the 5 process case?

> So this solution isn't perfect; it would still be nice to have a way to
> delay initiation of the WAL write until "just before" the disk is ready
> to accept it. I dunno any good way to do that, though.

I still think that it would be much faster to just keep writing the WAL log blocks as they fill up and have a separate process wake the committing process when the write completes. This would eliminate WAL writing as a bottleneck.

I have yet to hear anyone say that this can't be done, only that we might not want to do it because the code might not be clean.

I'm generally only happy when I can finally remove a bottleneck completely, but speeding one up by 3X like you have done is pretty damn cool for a day or two's work.
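
To make the arithmetic above concrete, here is a rough back-of-the-envelope sketch. The function names are made up for illustration, and the figure of 3 commits per process per revolution is the hypothetical from the text above, not a measurement.

```python
def blocking_rate(group_sizes):
    """Average transactions per disk revolution when one commit group
    is flushed per revolution and the group sizes repeat in a cycle."""
    return sum(group_sizes) / len(group_sizes)


def nonblocking_rate(processes, commits_per_process_per_rev):
    """Transactions per revolution if no backend ever waits on another
    backend's flush (hypothetical upper bound)."""
    return processes * commits_per_process_per_rev


# 5 backends: groups of 1 and 4 alternate from one revolution to the next.
blocking = blocking_rate([1, 4])

# Assumed: each of the 5 backends could finish 3 commits per revolution.
ideal = nonblocking_rate(5, 3)

print(blocking, ideal, ideal / blocking)  # 2.5 15 6.0
```

The ratio is only an upper bound under those assumptions; it says nothing about whether aio can actually deliver it.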
- Curtis