Re: [Qemu-devel] [RFC][PATCH] make sure disk writes actually hit disk
Hi,

Using O_SYNC for disk image access is not acceptable: QEMU relies on the host OS to ensure that the data is written correctly. Even the current 'fsync' support is questionable, to say the least!

Please don't mix issues regarding QEMU disk handling and the underlying hypervisor/host OS block device handling.

Regards,

Fabrice.

Rik van Riel wrote:

This is the simple approach to making sure that disk writes actually hit disk before we tell the guest OS that IO has completed. Thanks to DMA_MULTI_THREAD the performance still seems to be adequate.

A fancier solution would be to make the sync/non-sync behaviour of the qemu disk backing store tunable from the guest OS, by tuning the IDE disk write cache on/off with hdparm, and having hw/ide.c call ->fsync functions in the block backends. I'm willing to code up the fancy solution if people prefer that.

Make sure disk writes really made it to disk before we report I/O completion to the guest domain. The DMA_MULTI_THREAD functionality from the qemu-dm IDE emulation should make the performance overhead of synchronous writes bearable, or at least comparable to native hardware.
Signed-off-by: Rik van Riel <[EMAIL PROTECTED]>

--- xen-unstable-10712/tools/ioemu/block-bochs.c.osync 2006-07-28 02:15:56.0 -0400
+++ xen-unstable-10712/tools/ioemu/block-bochs.c 2006-07-28 02:21:08.0 -0400
@@ -91,7 +91,7 @@
     int fd, i;
     struct bochs_header bochs;
 
-    fd = open(filename, O_RDWR | O_BINARY | O_LARGEFILE);
+    fd = open(filename, O_RDWR | O_BINARY | O_LARGEFILE | O_SYNC);
     if (fd < 0) {
         fd = open(filename, O_RDONLY | O_BINARY | O_LARGEFILE);
         if (fd < 0)
--- xen-unstable-10712/tools/ioemu/block.c.osync 2006-07-28 02:15:56.0 -0400
+++ xen-unstable-10712/tools/ioemu/block.c 2006-07-28 02:19:27.0 -0400
@@ -677,7 +677,7 @@
     int rv;
 #endif
 
-    fd = open(filename, O_RDWR | O_BINARY | O_LARGEFILE);
+    fd = open(filename, O_RDWR | O_BINARY | O_LARGEFILE | O_SYNC);
     if (fd < 0) {
         fd = open(filename, O_RDONLY | O_BINARY | O_LARGEFILE);
         if (fd < 0)
--- xen-unstable-10712/tools/ioemu/block-cloop.c.osync 2006-07-28 02:15:56.0 -0400
+++ xen-unstable-10712/tools/ioemu/block-cloop.c 2006-07-28 02:17:13.0 -0400
@@ -55,7 +55,7 @@
     BDRVCloopState *s = bs->opaque;
     uint32_t offsets_size,max_compressed_block_size=1,i;
 
-    s->fd = open(filename, O_RDONLY | O_BINARY | O_LARGEFILE);
+    s->fd = open(filename, O_RDONLY | O_BINARY | O_LARGEFILE | O_SYNC);
     if (s->fd < 0)
         return -1;
     bs->read_only = 1;
--- xen-unstable-10712/tools/ioemu/block-cow.c.osync 2006-07-28 02:15:56.0 -0400
+++ xen-unstable-10712/tools/ioemu/block-cow.c 2006-07-28 02:21:34.0 -0400
@@ -69,7 +69,7 @@
     struct cow_header_v2 cow_header;
     int64_t size;
 
-    fd = open(filename, O_RDWR | O_BINARY | O_LARGEFILE);
+    fd = open(filename, O_RDWR | O_BINARY | O_LARGEFILE | O_SYNC);
     if (fd < 0) {
         fd = open(filename, O_RDONLY | O_BINARY | O_LARGEFILE);
         if (fd < 0)
--- xen-unstable-10712/tools/ioemu/block-qcow.c.osync 2006-07-28 02:15:56.0 -0400
+++ xen-unstable-10712/tools/ioemu/block-qcow.c 2006-07-28 02:20:05.0 -0400
@@ -95,7 +95,7 @@
     int fd, len, i, shift;
     QCowHeader header;
 
-    fd = open(filename, O_RDWR | O_BINARY | O_LARGEFILE);
+    fd = open(filename, O_RDWR | O_BINARY | O_LARGEFILE | O_SYNC);
     if (fd < 0) {
         fd = open(filename, O_RDONLY | O_BINARY | O_LARGEFILE);
         if (fd < 0)
--- xen-unstable-10712/tools/ioemu/block-vmdk.c.osync 2006-07-28 02:15:56.0 -0400
+++ xen-unstable-10712/tools/ioemu/block-vmdk.c 2006-07-28 02:20:20.0 -0400
@@ -96,7 +96,7 @@
     uint32_t magic;
     int l1_size;
 
-    fd = open(filename, O_RDWR | O_BINARY | O_LARGEFILE);
+    fd = open(filename, O_RDWR | O_BINARY | O_LARGEFILE | O_SYNC);
     if (fd < 0) {
         fd = open(filename, O_RDONLY | O_BINARY | O_LARGEFILE);
         if (fd < 0)

___
Qemu-devel mailing list
Qemu-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/qemu-devel
Re: [Qemu-devel] [RFC][PATCH] make sure disk writes actually hit disk
Fabrice Bellard wrote:

> Hi,
>
> Using O_SYNC for disk image access is not acceptable: QEMU relies on the
> host OS to ensure that the data is written correctly.

This means that write ordering is not preserved, and on a power failure any data written by qemu (or Xen fully virt) guests may not be preserved.

Applications running on the host can count on fsync doing the right thing, meaning that if they call fsync, the data *will* have made it to disk. Applications running inside a guest have no guarantees that their data is actually going to make it anywhere when fsync returns...

This may look like hair splitting, but so far I've lost a (test) postgresql database to this 3 times already. Not getting the guest application's data to disk when the application calls fsync is a recipe for disaster.

--
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." - Brian W. Kernighan
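For readers following the thread, here is a minimal sketch of the guarantee being asked for: a write is only reported as complete once fsync() on the host has returned. The helper names write_all and write_durable are made up for illustration and are not QEMU code; how far fsync() really pushes the data (page cache, controller, platter) depends on the host OS and hardware.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write the whole buffer, retrying on short writes and EINTR.
 * Returns 0 on success, -1 on error. */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: report success only after fsync() has returned,
 * i.e. after the host OS claims the data has been flushed. */
int write_durable(int fd, const void *buf, size_t len)
{
    if (write_all(fd, buf, len) < 0)
        return -1;
    return fsync(fd);
}
```

Opening the image with O_SYNC, as the patch does, asks for roughly this behaviour on every write() without an explicit fsync() call.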
Re: [Qemu-devel] [RFC][PATCH] make sure disk writes actually hit disk
On Saturday 29 July 2006 15:59, Rik van Riel wrote:
> Fabrice Bellard wrote:
> > Hi,
> >
> > Using O_SYNC for disk image access is not acceptable: QEMU relies on the
> > host OS to ensure that the data is written correctly.
>
> This means that write ordering is not preserved, and on a power
> failure any data written by qemu (or Xen fully virt) guests may
> not be preserved.

I might be willing to accept this (or similar) patch if you made it conditional on the guest having disabled write caching. I agree with Fabrice that the performance impact is too severe to consider turning it on by default.

The same problem occurs with many hardware RAID controllers, and even many hard drives: fsync() only guarantees that the data has been passed to the controller (in this case the host OS). If you need absolute reliability you need either more flushing in your guest OS, a disabled write cache, or battery backup to make sure the IDE hardware (i.e. host OS) doesn't die unexpectedly.

Paul
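A rough sketch of the conditional behaviour Paul describes, assuming the emulated disk tracks whether the guest has left its write cache enabled. The struct and function names here are hypothetical and do not correspond to anything in hw/ide.c or the block backends.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-disk state; not QEMU's actual structure. */
struct disk_state {
    int fd;                    /* host file backing the disk image */
    bool write_cache_enabled;  /* what the guest selected, e.g. via hdparm */
};

/* Write at an offset. If the guest has disabled the write cache,
 * flush to the host before reporting completion. */
int disk_write(struct disk_state *d, const void *buf, size_t len, off_t off)
{
    ssize_t n = pwrite(d->fd, buf, len, off);

    if (n < 0 || (size_t)n != len)
        return -1;
    if (!d->write_cache_enabled)
        return fsync(d->fd);
    return 0;
}

/* What an explicit cache-flush request from the guest could map to. */
int disk_flush(struct disk_state *d)
{
    return fsync(d->fd);
}
```

With the cache left on, writes stay as cheap as they are today and only an explicit flush pays for fsync(); with the cache off, every write pays for it.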
Re: [Qemu-devel] [RFC][PATCH] make sure disk writes actually hit disk
Paul Brook wrote:

> On Saturday 29 July 2006 15:59, Rik van Riel wrote:
> > Fabrice Bellard wrote:
> > > Hi,
> > >
> > > Using O_SYNC for disk image access is not acceptable: QEMU relies on the
> > > host OS to ensure that the data is written correctly.
> >
> > This means that write ordering is not preserved, and on a power
> > failure any data written by qemu (or Xen fully virt) guests may
> > not be preserved.
>
> I might be willing to accept this (or similar) patch if you made it
> conditional on the guest having disabled write caching. I agree with
> Fabrice that the performance impact is too severe to consider turning it
> on by default.

Easy to do with the fsync infrastructure, but probably not worth doing since people are working on the AIO I/O backend, which would allow multiple outstanding writes from a guest. That, in turn, means I/O completion in the guest can be done when the data really hits disk, but without a performance impact.
Re: [Qemu-devel] [RFC][PATCH] make sure disk writes actually hit disk
> Easy to do with the fsync infrastructure, but probably not worth
> doing since people are working on the AIO I/O backend, which would
> allow multiple outstanding writes from a guest. That, in turn,
> means I/O completion in the guest can be done when the data really
> hits disk, but without a performance impact.

Not entirely true. That only works if you allow multiple guest IO requests in parallel, i.e. some form of tagged command queueing. This requires either improving the SCSI emulation, or implementing SATA emulation. AFAIK parallel IDE doesn't support command queueing.

My impression was that the initial AIO implementation is just straight serial async operation. IO wouldn't actually go any faster; it just means the guest can do something else while it's waiting.

Paul
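To make the serial-versus-queued point concrete, here is a deliberately idealized model, not a measurement: every request is assumed to take the same latency, and up to queue_depth requests are assumed to overlap perfectly. The function name and the numbers fed to it are illustrative only.

```c
/* Idealized model: each request takes latency_ms, and up to queue_depth
 * requests can be in flight at once. Returns total elapsed time in ms. */
unsigned long total_io_time_ms(unsigned long requests,
                               unsigned long queue_depth,
                               unsigned long latency_ms)
{
    unsigned long batches;

    if (queue_depth == 0)
        queue_depth = 1;   /* treat "no queueing" as one request at a time */
    batches = (requests + queue_depth - 1) / queue_depth;  /* round up */
    return batches * latency_ms;
}
```

Under this model a queue depth of 1 (serial async operation) takes exactly as long as synchronous I/O; the elapsed time only drops once the emulated device accepts several outstanding requests.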
Re: [Qemu-devel] [RFC][PATCH] make sure disk writes actually hit disk
How about compromising, and making the patch a run-time option? Presumably this is only a problem when the virtual machine is not properly shut down. Those who want the extra security of knowing the data will be written regardless of the shutdown status can enable the flag. By default it could be turned off. Then everybody can be happy.

Bill

On 7/29/06, Rik van Riel <[EMAIL PROTECTED]> wrote:

> Fabrice Bellard wrote:
> > Hi,
> >
> > Using O_SYNC for disk image access is not acceptable: QEMU relies on the
> > host OS to ensure that the data is written correctly.
>
> This means that write ordering is not preserved, and on a power
> failure any data written by qemu (or Xen fully virt) guests may
> not be preserved.
>
> Applications running on the host can count on fsync doing the
> right thing, meaning that if they call fsync, the data *will*
> have made it to disk. Applications running inside a guest have
> no guarantees that their data is actually going to make it
> anywhere when fsync returns...
>
> This may look like hair splitting, but so far I've lost a (test)
> postgresql database to this 3 times already. Not getting
> the guest application's data to disk when the application calls
> fsync is a recipe for disaster.
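One way the run-time option could look, as a sketch only: the sync_writes parameter stands in for whatever command-line flag would be chosen, and image_open_flags and image_open are hypothetical names. O_BINARY and O_LARGEFILE from the patch are left out to keep the example portable.

```c
#include <fcntl.h>
#include <stdbool.h>

/* Hypothetical: choose open(2) flags from a user-supplied option.
 * Off by default, as suggested above. */
int image_open_flags(bool sync_writes)
{
    int flags = O_RDWR;

    if (sync_writes)
        flags |= O_SYNC;
    return flags;
}

/* Open read-write with the chosen flags, falling back to read-only
 * in the same way the patched code does. Returns fd or -1. */
int image_open(const char *filename, bool sync_writes)
{
    int fd = open(filename, image_open_flags(sync_writes));

    if (fd < 0)
        fd = open(filename, O_RDONLY);
    return fd;
}
```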
[Qemu-devel] Accelerator feature
QEMU Accelerator license is not ethical.

Open the code unless you have something to hide.
Re: [Qemu-devel] Accelerator feature
On Sat, 29 Jul 2006, Karlos . wrote:

> QEMU Accelerator license is not ethical.
>
> Open the code unless you have something to hide.

Karlos,

do you think it's ethical to attack Fabrice for his kernel module license choice? Please consider what you've done for Qemu yourself, and what the Qemu user/developer community has gained from your work on Qemu. If you have not provided anything, then please just think about it next time you're going to attack somebody who has provided a lot.

Thanks,
Karel
--
Karel Gardas [EMAIL PROTECTED]
ObjectSecurity Ltd. http://www.objectsecurity.com
Re: [Qemu-devel] Accelerator feature
Don't use it if you don't like the license... the author has the right to license it how he wants. Or... you can develop your own version, or use/contribute to the open source alternative, qvm86.

- Leo Reiter

Karlos . wrote:
>
> QEMU Accelerator license is not ethical.
>
> Open the code unless you have something to hide.
>

--
Leonardo E. Reiter
Vice President of Product Development, CTO
Win4Lin, Inc.
Virtual Computing that means Business
Main: +1 512 339 7979
Fax: +1 512 532 6501
http://www.win4lin.com
Re: [Qemu-devel] Performance issues with -usb
Brad,

I think the 0x20 (32ms) value might be fine for something like VNC display, but on a local display the mouse response is noticeably slow (for me at least). I tried using 0x0a (10), same as the plain USB mouse, but the idle CPU utilization was still around 10%. What is really odd is that this was not a problem in older versions of QEMU, unless I'm reading data incorrectly. Do you know what else might have changed? I will look as well and see if I can apply some logic to this. I suspect it's probably something with the new clock mechanism, but it still doesn't make a lot of sense to me.

I've looked at both Windows 2000 and XP guests and seen similar results so far with the latest QEMU/KQEMU combination. The good news at least is that the 0x0a interval seems very smooth even on a local display, and it is certainly much friendlier on the CPU than the CVS version. Still, 10% idle usage is quite high. If it can't logically be fixed to provide both a) low idle CPU and b) smooth local display response, then I might just post a runtime option patch allowing the user to configure the interval on the command line. I'm not sure how correct that is, but it would be useful to me at least.

- Leo Reiter

Brad Campbell wrote:
> Lonnie Mendez wrote:
>
>> Perhaps tweaking the value of ep_bInterval for the tablet's status
>> change endpoint would help? The endpoint descriptor for the tablet
>> currently has this at 3 milliseconds. The hid mouse reports a 10
>> millisecond polling interval.
>
> Indeed. I'm not quite sure how or why I did that in the 1st place as the
> tablet started life as a copy of the mouse in any case.
>
> I've had a good drag through the specs and all the data sheets for mouse
> chips I could find out there and most of them seem to recommend a value
> no faster than 8ms.
>
> This drops the cpu utilisation of a Windows guest while idle about 75%
> when using -usbdevice tablet
> I've not noticed any change in usability or mouse responsiveness.
> (I played with values up to 0xFF but after about 0x20 there seemed to be
> immeasurable/no difference)
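As a back-of-the-envelope aid for the intervals mentioned in this thread (3 ms, 8 ms, 10 ms and 0x20 = 32 ms), the helper below converts a polling interval in milliseconds into polls per second. The function name is made up for illustration, and guest CPU load is not necessarily proportional to the poll rate.

```c
/* Polls per second for a polling interval given in milliseconds.
 * Integer division, so the result is rounded down. */
unsigned polls_per_second(unsigned interval_ms)
{
    return interval_ms ? 1000u / interval_ms : 0u;
}
```

This works out to 333 polls per second at 3 ms, 125 at 8 ms, 100 at 10 ms and 31 at 32 ms.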
[Qemu-devel] qemu cpu-exec.c
CVSROOT: /sources/qemu
Module name: qemu
Changes by: Paul Brook 06/07/29 19:09:31

Modified files:
    . : cpu-exec.c

Log message:
    Arm host build fix.

CVSWeb URLs:
http://cvs.savannah.gnu.org/viewcvs/qemu/cpu-exec.c?cvsroot=qemu&r1=1.83&r2=1.84
Re: [Qemu-devel] Performance issues with -usb
On Sat, 29 Jul 2006, Leonardo E. Reiter wrote:

> Brad,
>
> I think the 0x20 (32ms) value might be fine for something like VNC display,
> but on a local display the mouse response is noticeably slow (for me at
> least.) I tried using 0x0a (10), same as the plain USB mouse, but the idle
> CPU utilization was still around 10%. What is really odd is that this was
> not a problem in older versions of QEMU, unless I'm reading data
> incorrectly. Do you know what else might have changed? I will look as well
> and see if I can apply some logic to this. I suspect it's probably
> something with the new clock mechanism, but it still doesn't make a lot of
> sense to me.

Sometime after usbtablet hit CVS I tried it out, and the first thing I noticed was the CPU load. On June 12 I tried to figure out if one of the patch authors (Anthony Liguori) is aware of the situation. So it's not related to the gettimeofday switchover.

--
mailto:[EMAIL PROTECTED]
Re: [Qemu-devel] Accelerator feature
To be honest, I'd be willing to PAY Fabrice for QEMU.

What else can I use to boot Windows 2k on my Sun Blade 2000 at work?

jonathan

On 7/29/06, Karel Gardas <[EMAIL PROTECTED]> wrote:

> On Sat, 29 Jul 2006, Karlos . wrote:
> >
> > QEMU Accelerator license is not ethical.
> >
> > Open the code unless you have something to hide.
>
> Karlos,
>
> do you think it's ethical to attack Fabrice for his kernel module
> license choice? Please consider what you've done for Qemu yourself, and
> what the Qemu user/developer community has gained from your work on Qemu.
> If you have not provided anything, then please just think about it next
> time you're going to attack somebody who has provided a lot.
>
> Thanks,
> Karel
> --
> Karel Gardas [EMAIL PROTECTED]
> ObjectSecurity Ltd. http://www.objectsecurity.com

--
Jonathan Kalbfeld
+1 323 620 6682
Re: [Qemu-devel] Accelerator feature
On Sat, 2006-07-29 at 14:56 -0700, Jonathan Kalbfeld wrote:
> To be honest, I'd be willing to PAY Fabrice for QEMU.
>
> What else can I use to boot windows 2k on my sun blade 2000 at work?
>
> jonathan

I have never seen anyone say why KQEMU is closed, though, at least not beyond speculation. And I've searched the mailing list archives at Gmane, reading the long thread from when KQEMU was new and the various threads since then. I certainly respect that, as the author of the code, Fabrice has made that choice, but I'd just be interested to know why.

QVM86 was a viable alternative in my mind, at least until KQEMU got kernel-mode virtualization. Oh, well.

Andrew
[Qemu-devel] Re: Accelerator feature
Please don't feed the trolls :-)

Best thing to do is just ignore messages like this.

Regards,

Anthony Liguori

On Sat, 29 Jul 2006 14:13:44 -0400, Leonardo E. Reiter wrote:

> Don't use it if you don't like the license... the author has the right to
> license it how he wants. Or... you can develop your own version, or
> use/contribute to the open source alternative, qvm86.
>
> - Leo Reiter
>
> Karlos . wrote:
>>
>> QEMU Accelerator license is not ethical.
>>
>> Open the code unless you have something to hide.
>>