Greg Smith <g...@2ndquadrant.com> writes:
> So my guess is that some small percentage of Windows users might notice
> a change here, and some testing on FreeBSD would be useful too. That's
> about it for platforms that I think anybody needs to worry about.

To my mind, O_DIRECT is not really the key issue here, it's whether to
prefer O_DSYNC or fdatasync. I looked back in the archives, and I think
that the main reason we prefer O_DSYNC when available is the results I
got here:

http://archives.postgresql.org/pgsql-hackers/2001-03/msg00381.php

which demonstrated a performance benefit on HPUX 10.20, though with a
test tool much more primitive than test_fsync. I still have that
machine, although the disk that was in it at the time died awhile back.
What's in there now is a Seagate ST336607LW spinning at 10000 RPM (166
rev/sec) and today I get numbers like this from test_fsync:

Simple write:
        8k write                      28331.020/second

Compare file sync methods using one write:
        open_datasync 8k write          161.190/second
        open_sync 8k write              156.478/second
        8k write, fdatasync              54.302/second
        8k write, fsync                  51.810/second

Compare file sync methods using two writes:
        2 open_datasync 8k writes        81.702/second
        2 open_sync 8k writes            80.172/second
        8k write, 8k write, fdatasync    40.829/second
        8k write, 8k write, fsync        39.836/second

Compare open_sync with different sizes:
        open_sync 16k write              80.192/second
        2 open_sync 8k writes            78.018/second

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
        8k write, fsync, close           52.527/second
        8k write, close, fsync           54.092/second

So *on that rather ancient platform* there's a measurable performance
benefit to O_DSYNC, but this seems to be largely because fdatasync is
stubbed to fsync in userspace rather than because fdatasync wouldn't be
a better idea in the abstract. Also, a lot of the argument against
fsync at the time was that it forced the kernel to iterate through all
the buffers for the WAL file to see if any were dirty.
I would imagine that modern kernels are a tad smarter about that; and
even if they aren't, the CPU speed versus disk speed tradeoff has
changed enough since 2001 that iterating through 16MB of buffers isn't
as interesting as it was then.

So to my mind, switching to the preference order fdatasync,
fsync_writethrough, fsync seems like the thing to do. Since we assume
fsync is always available, that means that O_DSYNC/O_SYNC will not be
the defaults on any platform.

			regards, tom lane
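
For illustration only: the preference order proposed above (fdatasync, then fsync_writethrough, then fsync) could be sketched in C roughly as follows. This is a minimal sketch under stated assumptions, not PostgreSQL's actual code; the enum, both function names, and the two availability flags are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three candidate defaults discussed above. */
typedef enum
{
    DEMO_SYNC_FDATASYNC,
    DEMO_SYNC_FSYNC_WRITETHROUGH,
    DEMO_SYNC_FSYNC
} DemoSyncMethod;

/*
 * Pick a default in the proposed order: fdatasync first, then
 * fsync_writethrough, then plain fsync.  The two flags are assumed to
 * come from a platform check done elsewhere (hypothetical).  fsync is
 * assumed to be always available, so it is the unconditional fallback
 * and O_DSYNC/O_SYNC are never selected here.
 */
static DemoSyncMethod
demo_choose_default_sync_method(bool have_fdatasync,
                                bool have_fsync_writethrough)
{
    if (have_fdatasync)
        return DEMO_SYNC_FDATASYNC;
    if (have_fsync_writethrough)
        return DEMO_SYNC_FSYNC_WRITETHROUGH;
    return DEMO_SYNC_FSYNC;
}

/* Map a label to a printable name, for logging or test output. */
static const char *
demo_sync_method_name(DemoSyncMethod method)
{
    switch (method)
    {
        case DEMO_SYNC_FDATASYNC:
            return "fdatasync";
        case DEMO_SYNC_FSYNC_WRITETHROUGH:
            return "fsync_writethrough";
        case DEMO_SYNC_FSYNC:
            return "fsync";
    }
    assert(!"unknown sync method");
    return "unknown";
}
```

Calling demo_choose_default_sync_method(false, false) falls through to the fsync label, which is the case for a platform offering neither of the other two methods; any platform with fdatasync gets it regardless of the second flag.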