Anthony Liguori wrote:
> qemu-img create -f raw foo.img 10G
> mkfs.ext3 foo.img
> mount -oloop,rw,barrier=1 -t ext3 foo.img mnt
>
> Works perfectly fine.

Hmm, interesting. I didn't know loop propagated barriers.

So you're suggesting using qemu with a loop device, ext2 (a bit faster
than ext3) with barrier=0 (well, that's implied if you use ext2), and a
raw image file on the ext2/3 filesystem, to get the effect of flush=off,
because the loop device caches block writes on the host, except for
explicit barrier requests from the fs, which are turned off?

That wasn't obvious the first time :-)

Does the loop device cache fs writes instead of propagating them
immediately to the underlying fs? I guess it probably does.

Does the loop device allow the backing file to grow sparsely, to get
behaviour like qcow2?

That's ugly but it might just work.

> >2. barrier=0 does _not_ provide the cache=off behaviour. It only
> >disables barriers; it does not prevent writing to the disk hardware.
>
> The proposal has nothing to do with cache=off.

Sorry, I meant flush=off (the proposal). Mounting the host filesystem
(i.e. not using a loop device anywhere) with barrier=0 doesn't have even
close to the same effect.

> >>The problem with options added for developers is that those options are
> >>very often accidentally used for production.
> >>
> >We already have risky cache= options. Also, do we call fdatasync
> >(with barrier) on _every_ write for guests which disable the
> >emulated disk cache?
>
> None of our cache= options should result in data corruption on power
> loss. If they do, it's a bug.

(I might have the details below a bit off.)

If cache=none uses O_DIRECT without calling fdatasync for guest
barriers, then it will get data corruption on power loss.

If cache=none does call fdatasync for guest barriers, then it might
still get corruption on power loss; I am not sure whether recent Linux
hosts issue the necessary barriers for O_DIRECT+fdatasync (with no
buffered writes to commit). I am quite sure that older kernels did not.

cache=writethrough will get data corruption on power loss with older
Linux host kernels. On those kernels, O_DSYNC did not issue barriers.
I'm not sure whether the recently changed O_DSYNC behaviour now issues
barriers after every write.

Provided all the cache= options call fdatasync/fsync when the guest
issues a cache flush, and call fdatasync/fsync following _every_ write
when the guest has disabled the emulated write cache, that should be as
good as Qemu can reasonably do. It's up to the host from there.

-- Jamie
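
As a rough illustration of the rule in the last paragraph (sync on every guest cache flush, and after every write when the guest has disabled the emulated write cache), here is a minimal Python sketch. The ImageFile class and all of its names are hypothetical and are not Qemu code. It assumes a Linux host where os.fdatasync is available, and whether that call actually reaches the disk hardware is still up to the host kernel.

```python
import os
import tempfile


class ImageFile:
    """Hypothetical host-side wrapper around a raw image file (not Qemu code)."""

    def __init__(self, path, guest_write_cache_enabled=True):
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        self.guest_write_cache_enabled = guest_write_cache_enabled
        self.sync_count = 0  # kept only so the demo can show when syncs happen

    def _sync(self):
        # Ask the host to commit file data; what the host does with
        # barriers below this point is outside our control.
        os.fdatasync(self.fd)
        self.sync_count += 1

    def guest_write(self, offset, data):
        # pwrite may write fewer bytes than asked, so loop until done.
        view = memoryview(data)
        done = 0
        while done < len(view):
            done += os.pwrite(self.fd, view[done:], offset + done)
        # With the emulated write cache disabled, every write must be
        # committed before it is reported complete to the guest.
        if not self.guest_write_cache_enabled:
            self._sync()

    def guest_flush(self):
        # An explicit guest cache flush always syncs.
        self._sync()

    def close(self):
        os.close(self.fd)


def demo():
    """Return the sync counts for a cached and an uncached guest."""
    counts = []
    with tempfile.TemporaryDirectory() as tmp:
        for cache_enabled in (True, False):
            img = ImageFile(os.path.join(tmp, "foo.img"), cache_enabled)
            img.guest_write(0, b"a" * 512)
            img.guest_write(512, b"b" * 512)
            img.guest_flush()
            counts.append(img.sync_count)
            img.close()
    return tuple(counts)


if __name__ == "__main__":
    print(demo())
```

With two writes and one flush, the cached guest triggers one sync and the uncached guest triggers three, which is the cost being weighed in the fdatasync-on-every-write question above.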