> On Sun, 23 Jun 2002, Bruce Momjian wrote:
>> Yes, I don't see writing to two files vs. one to be any win, especially
>> when we need to fsync both of them. What I would really like is to
>> avoid the double I/O of writing to WAL and to the data file; improving
>> that would be a huge win.

I don't believe it's possible to eliminate the double I/O. Keep in mind though that in the ideal case (plenty of shared buffers) you are only paying two writes per modified block per checkpoint interval --- one to the WAL during the first write of the interval, and then a write to the real datafile issued by the checkpoint process. Anything that requires transaction commits to write data blocks will likely result in more I/O not less, at least for blocks that are modified by several successive transactions.

The only thing I've been able to think of that seems like it might improve matters is to make the WAL writing logic aware of the layout of buffer pages --- specifically, to know that our pages generally contain an uninteresting "hole" in the middle, and not write the hole. Optimistically this might reduce the WAL data volume by something approaching 50%; though pessimistically (if most pages are near full) it wouldn't help much.

This was not very feasible when the WAL code was designed because the buffer manager needed to cope with both normal pages and pg_log pages, but as of 7.2 I think it'd be safe to assume that all pages have the standard layout.

regards, tom lane
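
A minimal sketch of the hole-skipping idea described above, not the actual WAL code. It assumes a fixed 8192-byte page and two hypothetical offsets, `lower` and `upper`, that bound the unused hole in the middle of the page; the function names are illustrative only. The page image written to the log omits the hole, and the reader puts it back as zeros.

```python
PAGE_SIZE = 8192  # assumed block size for this sketch


def compress_page(page: bytes, lower: int, upper: int) -> bytes:
    """Return the page image with the unused hole [lower, upper) left out.

    `lower` and `upper` are hypothetical offsets marking where the hole
    starts and ends; everything outside that range is kept.
    """
    if len(page) != PAGE_SIZE:
        raise ValueError("page must be exactly PAGE_SIZE bytes")
    if not 0 <= lower <= upper <= PAGE_SIZE:
        raise ValueError("invalid hole bounds")
    return page[:lower] + page[upper:]


def restore_page(image: bytes, lower: int, upper: int) -> bytes:
    """Rebuild a full page from a hole-less image, zero-filling the hole."""
    hole = upper - lower
    if not 0 <= lower <= upper <= PAGE_SIZE:
        raise ValueError("invalid hole bounds")
    if len(image) != PAGE_SIZE - hole:
        raise ValueError("image length does not match hole bounds")
    return image[:lower] + bytes(hole) + image[lower:]


# A half-empty page: 1024 bytes used at the front, 3072 bytes at the back.
lower, upper = 1024, PAGE_SIZE - 3072
page = b"\x01" * lower + bytes(upper - lower) + b"\x02" * (PAGE_SIZE - upper)
image = compress_page(page, lower, upper)
print(len(page), len(image))
```

For this half-empty page the logged image is half the size of the full page, which is the optimistic case; for a nearly full page `upper - lower` is small and the saving mostly disappears. The sketch also assumes the hole contains nothing worth keeping, since it is restored as zeros.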