On Fri, Nov 1, 2019 at 10:34 AM Daniel P. Berrangé <berra...@redhat.com> wrote:
>
> On Fri, Nov 01, 2019 at 08:14:08AM +0100, Christian Ehrhardt wrote:
> > Hi everyone,
> > we've got a bug report recently - on handling qemu .so's through
> > upgrades - that got me wondering how to best handle it.
> > After checking with Paolo yesterday that there is no obvious solution
> > that I missed, we agreed this should be brought up on the list for
> > wider discussion.
> > Maybe there already is a good best practice out there, or if it
> > doesn't exist we might want to agree upon one going forward.
> > Let me outline the case and the ideas brought up so far.
> >
> > Case
> > - You have qemu representing a guest
> > - Due to other constraints, e.g. PT, you can't live migrate (which
> >   would be preferred)
> > - You haven't used a specific shared object yet - let's say the RBD
> >   storage driver as an example
> > - Qemu gets an update, packaging replaces the .so files on disk
> > - The Qemu process and the .so files on disk now have a mismatch in
> >   $buildid
> > - If you hotplug an RBD device it will fail to load the (now new) .so
>
> What happens when it fails to load ? Does the user get a graceful
> error message or does QEMU abort ? I'd hope the former.

It is fortunately a graceful error message, here an example:
$ virsh attach-device lateload curldisk.xml

The reported issue happens on attach:
root@b:~# virsh attach-device lateload cdrom-curl.xml
error: Failed to attach device from cdrom-curl.xml
error: internal error: unable to execute QEMU command 'device_add':
Property 'virtio-blk-device.drive' can't find value 'drive-virtio-disk2'

In the qemu output log we can see:
Failed to initialize module: /usr/lib/x86_64-linux-gnu/qemu/block-curl.so
Note: only modules from the same build can be loaded.

> > On almost any other service than "qemu representing a VM" the answer
> > is "restart it", some even re-exec in place to keep things up and
> > running.
> >
> > Ideas so far:
> > a) Modules are checked by build-id, so keep them in a per-build-id
> >    dir on disk
> >    - qemu could be made to look in a preferred -$buildid dir first
> >    - do not remove the packages with .so's on upgrades
> >    - needs a not-too-complex way to detect which buildids running
> >      qemu processes have, for packaging to be able to "autoclean later"
> >    - needs some dependency juggling for distro packaging, but IMHO it
> >      can be made to work if the above simple "probing buildid of
> >      running qemu" existed
>
> So this needs a bunch of special QEMU hacks in package mgmt tools
> to prevent the package upgrade & cleanup later. This does not look
> like a viable strategy to me.
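For what it's worth, the "probing buildid of running qemu" part itself
does not look too hard. A rough sketch of what packaging could check
(assuming elfutils is installed; binary name and paths of course differ
per distro):

  # build-id of the image the running qemu process was started from
  eu-readelf --notes /proc/$(pidof -s qemu-system-x86_64)/exe | grep 'Build ID'
  # build-id of the qemu binary currently on disk
  eu-readelf --notes /usr/bin/qemu-system-x86_64 | grep 'Build ID'

The harder part is, as you say, teaching the packaging tools to keep and
later clean up the old per-buildid directories based on that.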
> > b) Preload the modules before upgrade
> >    - One could load the .so files before the upgrade
> >    - The open file reference will keep the content around even with
> >      the on-disk file gone
> >    - lacking a 'load-module' command, that would require fake
> >      hotplugs, which seems wrong
> >    - requires additional upgrade pre-planning
> >    - kills most benefits of modular code without an actual need for
> >      it being loaded
>
> Well there's two benefits to the modular approach
>
>  - Allow a single build to be selectively installed on a host or
>    container image, such that the install disk footprint is reduced
>  - Allow a faster startup such that huge RBD libraries don't slow down
>    startup of VMs not using RBD disks.
>
> Preloading the modules before upgrade doesn't have to lose the second
> benefit. We just have to make sure the preloading doesn't impact the
> VM startup performance.

I haven't looked at it that way yet and somewhat neglected former
suggestions of such a command. I thought there might be concerns about
the "amount of loaded code", but it shouldn't be "active" unless we
really have a device of that kind, right?
You are right, it seems it won't "lose much" by loading all of them late.

> IOW, register a SIGUSR2 handler which preloads all modules it finds on
> disk. Have a pre-uninstall option on the .so package that sends SIGUSR2
> to all QEMU processes. The challenge of course is that signals are
> async.

If there were something as simple as a log line, people could check that
to ensure the async loading is done. Not perfectly synchronous, but maybe
useful if a new QMP command is considered too heavy.

> You might suggest a QMP command, but only 1 process can have the
> QMP monitor open at any time and that's libvirt. Adding a second QMP
> monitor instance is possible but kind of gross for this purpose.

This (hopefully) already is a corner case. I think admins would be ok
with `virsh qemu-monitor-command` or such. No need for a second QMP
monitor IMHO.
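To illustrate what I have in mind - purely hypothetical, no such command
exists today and the name is made up - an admin preparing for an upgrade
could run something like:

  # before the packaging replaces the .so files on disk, per affected guest:
  virsh qemu-monitor-command lateload '{"execute": "x-load-all-modules"}'

Going through libvirt's monitor pass-through avoids the need for a second
QMP monitor, although libvirt will flag the domain as tainted for it.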
> Another option would be to pre-load the modules during startup, but
> do it asynchronously, so that it's not blocking overall VM startup.
> eg just before starting the mainloop, spawn a background thread to
> load all remaining modules.
>
> This will potentially degrade performance of the guest CPUs a bit,
> but avoids the latency spike from being synchronous in the startup
> path.

As above, I think it could be ok to load them even later than the main
loop, as long as there is e.g. a reliable log entry people can check.
But that comes close to a QMP "are things loaded" command, and in that
case the synchronous "load-all-you-find" command seems to make more
sense.

> > c) go back to a non-modular build
> >    - No :-)
> >
> > d) anything else out there?
>
> e) Don't do upgrades on a host with running VMs :-)
>
> Upgrades can break the running VM even ignoring this particular
> QEMU module scenario.

Which is true and I think clear in general - I'd even assume it is
general guidance for almost all admins. But we all know that things
never end up so perfect, which is why I directly started with the
example case of a guest that can neither migrate away nor be restarted.

> f) Simply document that if you upgrade with running VMs then some
>    features like hotplug of RBD will become unavailable. Users can
>    then avoid upgrades if that matters to them.

That is similar to your approach above. It is absolutely valid and a
good best-practice policy, but the question is how we could help out
people who are locked in and still want to avoid that.
Your suggestion of a sync "load-all-modules" command could be a way out
for people for whom such policies are not an option - let's see what
other people think of it.

> Regards,
> Daniel
> --
> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o- https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
>

--
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd