On Thu, Jul 28, 2011 at 09:47:05AM +0200, Kevin Wolf wrote: > > Indeed. This has come up a few times, and actually is a mostly trivial > > task. Maybe we should give up waiting for -blockdev and separate cache > > mode settings and allow a nocache-writethrough or similar mode now? It's > > going to be around 10 lines of code + documentation. > > I understand that there may be reasons for using O_DIRECT | O_DSYNC, but > what is the explanation for O_DSYNC improving performance?
There isn't any, at least for modern Linux. O_DSYNC at this point is equivalent to a range fdatasync for each write call, and given that we're doing O_DIRECT the ranges flush doesn't matter. If you do have a modern host and an old guest it might end up beeing faster because the barrier implementation in Linux used to suck so badly, but that's not inhrent to the I/O model. If you guest however doesn't support cache flushes at all O_DIRECT | O_DSYNC is the only sane model to use for local filesystems and block devices. > Christoph, on another note: Can we rely on Linux AIO never returning > short writes except on EOF? Currently we return -EINVAL in this case, so > I hope it's true or we wouldn't return the correct error code. More or less. There's one corner case for all Linux I/O, and that is only writes up to INT_MAX are supported, and larger writes (and reads) get truncated to it. It's pretty nasty, but Linux has been vocally opposed to fixing this issue.