On 5/28/13 11:42 PM, "Stefan Hajnoczi" <stefa...@gmail.com> wrote:
>On Tue, May 28, 2013 at 06:00:08PM +0000, Mark Trumpold wrote:
>>
>>
>> >-----Original Message-----
>> >From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
>> >Sent: Monday, May 27, 2013 05:36 AM
>> >To: 'Mark Trumpold'
>> >Cc: 'Paolo Bonzini', qemu-devel@nongnu.org, ma...@tachyon.net
>> >Subject: Re: 'qemu-nbd' explicit flush
>> >
>> >On Sat, May 25, 2013 at 09:42:08AM -0800, Mark Trumpold wrote:
>> >> On 5/24/13 1:05 AM, "Stefan Hajnoczi" <stefa...@gmail.com> wrote:
>> >> >On Thu, May 23, 2013 at 09:58:31PM +0000, Mark Trumpold wrote:
>> >> >One thing to be careful of is whether these operations are asynchronous.
>> >> >The signal is asynchronous, you have no way of knowing when qemu-nbd is
>> >> >finished flushing to the physical disk.
>> >>
>> >> Right, of course. I missed the obvious.
>> >
>> >I missed something too. Paolo may have already hinted at this when he
>> >posted a dd oflag=sync command-line option:
>> >
>> >blockdev --flushbufs is the wrong tool because ioctl(BLKFLSBUF) only
>> >writes out dirty pages to the block device. It does *not* guarantee to
>> >send a flush request to the device.
>> >
>> >Therefore, the underlying image file may not be put into an up-to-date
>> >state by qemu-nbd.
>> >
>> >
>> >I suggest trying the following instead of blockdev --flushbufs:
>> >
>> >  python -c 'import os; os.fsync(open("/dev/loopX", "r+b"))'
>> >
>> >This should do the same as blockdev --flushbufs *plus* it sends and
>> >waits for the NBD FLUSH command.
>> >
>> >You may have to play with this command-line a little but the main idea
>> >is to open the block device and fsync it.
>> >
>> >Stefan
>> >
>>
>> Hi Stefan,
>>
>> One of my early experiments was adding a command line option to
>> 'qemu-nbd' that did an open on 'device' (similar to the -c option), and
>> then calling 'fsync' on the 'device'. By itself, I did not get a
>> complete flush to disk. Was I missing something?
>>
>> Empirically, the signal solution (blockdev --flushbufs plus
>> 'bdrv_flush_all') was keeping my disk consistent. My unit test
>> exercises the flush and snapshot pretty rigorously; that is, it never
>> passed before with 'qemu-nbd --cache=writeback ...'. However, I did not
>> want to rely on 'sleep' for the race condition.
>>
>> Is there any opportunity with the nbd client socket interface? The
>> advantage for me there is not modifying 'qemu-nbd' source.
>
>I'm suggesting that you don't need to modify qemu-nbd. If your host is
>running nbd.ko with flush support, then it should be enough to open the
>device and issue fsync(2).
>
>You can verify this using tcpdump(8) and checking that the NBD FLUSH
>command is really being sent by the host kernel. If not, double check
>you're using the latest nbd.ko.
>
>Stefan

Stefan,

I tried the 'fsync' approach.
It apparently has no effect with my 3.3.1 Linux kernel and patch.

Changing kernels is not an option for me at the moment, so I will
revisit when we have an opportunity to upgrade kernels, but for the
moment I'll have to stick with 'cache=writethrough'.

Thank you again for your attention and help.

Best Regards,
Mark T.
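
For reference, the "open the block device and fsync it" idea quoted above can be written out as a small script rather than a one-liner. This is only a sketch: the function name flush_block_device is made up for illustration, the device path is whatever node your setup exposes, and whether the fsync actually turns into an NBD FLUSH depends on the host's nbd.ko having flush support, which, as noted above, did not appear to be the case on my 3.3.1 kernel.

```python
import os


def flush_block_device(path):
    """Open `path` read-write and call fsync(2) on it.

    Returns only after fsync has returned, so unlike a signal-based
    approach the caller knows when the kernel considers the flush done.
    `flush_block_device` is a hypothetical helper name, not part of qemu.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


# Example call (device path is an assumption; substitute your own node):
# flush_block_device("/dev/loopX")
```

Because the call blocks until fsync returns, it avoids the 'sleep' workaround for the race condition, provided the kernel really forwards the flush to qemu-nbd.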