Hi All,

I'd like to accomplish 2 things with this message:
1) Unblock (one way or another) https://review.openstack.org/#/c/123957
2) Create some form of consensus on when it's okay to add temporary code to nova to work around bugs in external utilities.

So some background on this specific issue.

The issue was first reported in July 2014 at [1] and then clarified at [2]. The synopsis of the bug is that calling qemu-img convert -O raw /may/ generate a corrupt output file if the source image isn't fully flushed to disk. The coreutils folk discovered something similar in 2011 *sigh*

The clear and correct solution is to ensure that qemu-img uses FIEMAP_FLAG_SYNC. This in turn produces a measurable slowdown in that code path, so additionally it's best if qemu-img uses an alternate method to determine data status in a disk image. This has been done and will be included in qemu 2.2.0 when it's released. These fixes prompted a more substantial rework of that code in qemu, which is awesome but not *required* to fix the bug in qemu.

While we wait for $distros to get the fixed qemu, nova is still vulnerable to the bug. To that end I proposed a workaround in nova that forces images retrieved from glance to disk with an fsync() prior to calling qemu-img on them.

I admit that this is ugly and has a performance impact.

In order to reduce the impact of the fsync() I considered:

1) Testing the qemu version and only fsync()ing on affected versions.
   - Vendors will backport the fix to their version of qemu. The fixed version will still claim to be 2.1.0 (for example) and therefore trigger the fsync() when not required. Given how unreliable this will be, I dismissed it as an option.
2) API Change
   - In the case of this specific bug we only need to fsync() in certain scenarios. It would be easy to add a flag to IMAGE_API.download() to determine if this fsync() is required. This has the nice property of only having a performance impact in the suspect case (personally I'll take slow-and-correct over fast-and-buggy any day). My hesitation is that after we've modified the API it's very hard to remove that change when we decide the workaround is redundant.
3) Config file option
   - For many of the same reasons as the API change this seemed like a bad idea.

Does anyone have any other ideas?

One thing that I haven't done is measure the impact of the fsync() on any reasonable workload. This is mainly because I don't really know how. Sure, I could do some statistics in devstack, but I don't really think they'd be meaningful. Also, the size of the image in glance is fairly important. An fsync() of a 100 GB image is many times more painful than that of a 1 GB image.

While in Paris I was asked to look at other code paths in nova where we use qemu-img convert. I'm doing this analysis. To date I have some suspicions that snapshot (and migration) are affected, but no data that confirms or refutes that. I continue to look at the appropriate code in nova, libvirt and qemu.

I understand that there is more work to be done in this area, and I'm happy to do it. Having said that, from where I sit that work is not directly related to the bug that started this.

Since the idea is to remove this code as soon as all the distros we care about have a fixed qemu, I started an (albeit brief) discussion at [3] on which distros are in that list. Armed with that list I have opened (or am in the process of opening) bugs for each version of each distribution to make them aware of the issue and the fix. I have a status page at [4].

Okay, I think I'm done raving.

So moving forward:

1) What should I do with the open review?
2) What can we learn from this in terms of how we work around key utilities that are not in our direct power to change?
   - Is taking ugly code for "some time" okay? I understand that this is a complex issue, as we're relying on $developer to be around (or leave enough information for those that follow) to determine when it's okay to remove the ugliness.

If you made it this far, bravo!

Yours Tony.

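P.S. For anyone who hasn't opened the review, here is a rough sketch of the shape of the workaround. It is not the code under review: write_image(), its sync flag and qemu_img_convert_cmd() are hypothetical names made up for illustration, and the flag only hints at what an opt-in parameter on the download path might look like. The one thing it relies on is that flushing Python's buffer and then calling os.fsync() asks the kernel to push the file to disk before qemu-img ever reads it.

```python
import os
import tempfile


def write_image(path, chunks, sync=True):
    """Write image chunks to path.

    Hypothetical helper, not nova's API. With sync=True the data is
    forced to disk before returning, so a later reader that inspects
    the on-disk extents should not see unflushed regions.
    """
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
        if sync:
            f.flush()             # userspace buffer -> kernel
            os.fsync(f.fileno())  # kernel page cache -> disk


def qemu_img_convert_cmd(src, dst):
    """Build (but do not run) the convert call discussed in this thread."""
    return ["qemu-img", "convert", "-O", "raw", src, dst]


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "image.part")
    write_image(src, [b"\0" * 4096, b"data"], sync=True)
    print(" ".join(qemu_img_convert_cmd(src, os.path.join(workdir, "image.raw"))))
```

The cost is the fsync() itself, which grows with the amount of dirty data, hence the worry above about large images.
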
[1] https://bugs.launchpad.net/nova/+bug/1350766
[2] https://bugs.launchpad.net/qemu/+bug/1368815
[3] http://lists.openstack.org/pipermail/openstack-dev/2014-November/050526.html
[4] https://wiki.openstack.org/wiki/Bug1368815
_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev