Hi,

Thank you for getting back!

I'm trying to follow you, but I don't understand all the details. I would like to ask this question though: What is the difference between v2.8.0 and this commit? With v2.8.0 the same qemu command worked, but I admit it doesn't request sharing. We also use libvirt v1.3.4, which might be a problem, but at least we want to understand if the commit in question introduced an obvious problem or if it's all in the details.

Btw, the qemu command generated by libvirt is this one, sorry about that:

2017-03-31 17:40:10.956+0000: starting up libvirt version: 1.3.4, package: 0+amos3~u16.04 (Enea Armband Devops Team <armb...@enea.com> Fri, 13 Jan 2017 02:06:05 +0100), qemu version: 2.8.50(Debian 1:2.9+amos2~u16.04), hostname: node-2.domain.tld
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm -name instance-00000076,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-14-instance-00000076/master-key.aes -machine virt-2.8,accel=kvm,usb=off,gic-version=3 -cpu host -m 256 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 2812f3c9-f564-499b-a8c7-e9e7ccf24143 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-14-instance-00000076/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -no-shutdown -boot strict=on -kernel /var/lib/nova/instances/2812f3c9-f564-499b-a8c7-e9e7ccf24143/kernel -initrd /var/lib/nova/instances/2812f3c9-f564-499b-a8c7-e9e7ccf24143/ramdisk -append 'root=/dev/vda1 rw rootwait console=tty0 console=ttyS0 console=ttyAMA0' -device i82801b11-bridge,id=pci.1,bus=pcie.0,addr=0x1 -device pci-bridge,chassis_nr=2,id=pci.2,bus=pci.1,addr=0x0 -usb -drive file=/var/lib/nova/instances/2812f3c9-f564-499b-a8c7-e9e7ccf24143/disk,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,aio=native -device virtio-blk-device,scsi=off,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/var/lib/nova/instances/2812f3c9-f564-499b-a8c7-e9e7ccf24143/disk.config,format=raw,if=none,id=drive-virtio-disk1,cache=none,aio=native -device virtio-blk-device,scsi=off,drive=drive-virtio-disk1,id=virtio-disk1 -netdev tap,fd=27,id=hostnet0 -device virtio-net-device,netdev=hostnet0,id=net0,mac=fa:16:3e:82:0a:2b -serial file:/var/lib/nova/instances/2812f3c9-f564-499b-a8c7-e9e7ccf24143/console.log -serial pty -vnc 0.0.0.0:0 -k en-us -device VGA,id=video0,vgamem_mb=16,bus=pci.2,addr=0x1 -device virtio-balloon-device,id=balloon0 -msg timestamp=on
Domain id=14 is tainted: high-privileges

Regards,
/Ciprian

-----Original Message-----
From: Max Reitz [mailto:mre...@redhat.com]
Sent: Friday, March 31, 2017 8:43 PM
To: Ciprian Barbu <ciprian.ba...@enea.com>; qemu-devel@nongnu.org; Eric Blake <ebl...@redhat.com>; Alexandru Avadanii <alexandru.avada...@enea.com>
Cc: Jeff Cody <jc...@redhat.com>; Markus Armbruster <arm...@redhat.com>; svc-armband <armb...@enea.com>; Kevin Wolf <kw...@redhat.com>
Subject: Re: [Qemu-devel] nbd: Possible regression in 2.9 RCs

On 31.03.2017 18:03, Ciprian Barbu wrote:
> Hello,
>
> Similar to the other thread about possible regression with rbd, there might
> be a regression with nbd.
> This time we are launching an instance from an image (not volume) and try to
> live migrate it:
>
> nova live-migration <test_instance>
>
> The nova-compute service complains with:
>
> 2017-03-31 15:32:56.179 7806 INFO nova.virt.libvirt.driver
> [req-15d79cbe-5956-4738-92df-3624e6b993ee
> d795de59fb9a4ea38776a11d20ae8469 cee03e74881f4ccba3b83345fb652b2c - -
> -] [instance: 6a04508f-5d79-4582-8e2c-4cc368753f6c] Migration running
> for 0 secs, memory 100% remaining; (bytes processed=0, remaining=0,
> total=0)
> 2017-03-31 15:32:58.029 7806 WARNING stevedore.named
> [req-73bc0113-5555-4dd8-8903-d3540cc61b47
> b9fbceeadd2d4d1bab9c90ae104db1f7 7e7db99b32c6467184701e9a0c2f1de7 - -
> -] Could not load instance_network_info
> 2017-03-31 15:32:59.038 7806 ERROR nova.virt.libvirt.driver
> [req-15d79cbe-5956-4738-92df-3624e6b993ee
> d795de59fb9a4ea38776a11d20ae8469 cee03e74881f4ccba3b83345fb652b2c - -
> -] [instance: 6a04508f-5d79-4582-8e2c-4cc368753f6c] Live Migration
> failure: internal error: unable to execute QEMU command
> 'nbd-server-add': Conflicts with use by drive-virtio-disk0 as 'root',
> which does not allow 'write' on #block143
> 2017-03-31 15:32:59.190 7806 ERROR nova.virt.libvirt.driver
> [req-15d79cbe-5956-4738-92df-3624e6b993ee
> d795de59fb9a4ea38776a11d20ae8469 cee03e74881f4ccba3b83345fb652b2c - -
> -] [instance: 6a04508f-5d79-4582-8e2c-4cc368753f6c] Migration
> operation has aborted
>
> I will try and bisect it myself, but I thought I would paste this here first,
> just so you know there is this issue too.

Well, I already know the commit in question. It's 8a7ce4f9338c475df1afc12502af704e4300a3e0 ("nbd/server: Use real permissions for NBD exports").

Whether this is a bug depends on the standpoint. I would very much consider it a bug fix because as of this commit you can no longer create a writable NBD server on a block device that is in use by a guest device without the guest device being aware of this.
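
To illustrate the kind of check that produces the "Conflicts with use by ... which does not allow 'write'" error quoted above, here is a toy model in Python. It is only a sketch: the class names, the two permission names ("read", "write") and the sharing defaults are assumptions made for illustration, not QEMU's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    """One user of a block node (hypothetical model, not a QEMU type)."""
    name: str
    requested: frozenset  # permissions this user needs for itself
    shared: frozenset     # permissions it tolerates other users holding


@dataclass
class Node:
    name: str
    users: list = field(default_factory=list)

    def attach(self, new: User) -> None:
        """Attach a user, refusing if either side needs an unshared permission."""
        for old in self.users:
            for wants, holder in ((new, old), (old, new)):
                blocked = wants.requested - holder.shared
                if blocked:
                    perm = sorted(blocked)[0]
                    raise PermissionError(
                        f"Conflicts with use by {holder.name}, "
                        f"which does not allow '{perm}' on {self.name}"
                    )
        self.users.append(new)


def guest_device(share_rw: bool) -> User:
    # Assumption: a guest disk reads and writes, and only tolerates other
    # writers when it was created with sharing enabled.
    shared = {"read", "write"} if share_rw else {"read"}
    return User("guest-device", frozenset({"read", "write"}), frozenset(shared))


def nbd_export(writable: bool) -> User:
    # Assumption: the export tolerates any other user of the node.
    requested = {"read", "write"} if writable else {"read"}
    return User("nbd-export", frozenset(requested), frozenset({"read", "write"}))
```

In this model, attaching `nbd_export(writable=True)` to a node already used by `guest_device(share_rw=False)` raises `PermissionError`, while the same export attaches cleanly when the guest device was created with `share_rw=True`.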
The problem is that the functionality to "make" the guest device "aware" of it was introduced only a couple of commits before, and it's called "share-rw".

So this doesn't work:

$ x86_64-softmmu/qemu-system-x86_64 \
    -blockdev node-name=image,driver=qcow2,\
file.driver=file,file.filename=foo.qcow2 \
    -device virtio-blk,drive=image \
    -qmp stdio
{"QMP": {"version": {"qemu": {"micro": 92, "minor": 8, "major": 2}, "package": " (v2.8.0-2038-g6604c893d0)"}, "capabilities": []}}
{'execute':'qmp_capabilities'}
{"return": {}}
{'execute':'nbd-server-start','arguments':{'addr':{'type':'inet','data':{'host':'localhost','port':'10809'}}}}
{"return": {}}
{'execute':'nbd-server-add','arguments':{'device':'image','writable':true}}
{"error": {"class": "GenericError", "desc": "Conflicts with use by /machine/peripheral-anon/device[0]/virtio-backend as 'root', which does not allow 'write' on image"}

But this works:

$ x86_64-softmmu/qemu-system-x86_64 \
    -blockdev node-name=image,driver=qcow2,\
file.driver=file,file.filename=foo.qcow2 \
    -device virtio-blk,drive=image,share-rw=on \
    -qmp stdio
{"QMP": {"version": {"qemu": {"micro": 92, "minor": 8, "major": 2}, "package": " (v2.8.0-2038-g6604c893d0)"}, "capabilities": []}}
{'execute':'qmp_capabilities'}
{"return": {}}
{'execute':'nbd-server-start','arguments':{'addr':{'type':'inet','data':{'host':'localhost','port':'10809'}}}}
{"return": {}}
{'execute':'nbd-server-add','arguments':{'device':'image','writable':true}}
{"return": {}}

(The difference is the share-rw=on in the -device parameter.)

So in theory all that's necessary is to set share-rw=on for the device in the management layer. But I'm not sure whether that's practical.

As for just allowing the NBD server write access to the device... To me that appears pretty difficult from an implementation perspective. We assert that nobody can write without having requested write access and we make sure that nobody can request write access without it being allowed. Making an exception for NBD seems very difficult and would probably mean we'd have to drop the assertion for write accesses altogether.

Max
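
For anyone scripting the working variant above, the pieces a management layer would have to generate could be sketched roughly as follows. The helper names `device_arg` and `nbd_export_commands` are hypothetical; the option string and the QMP commands are the ones from the example in this mail, and the sketch only builds them, it does not talk to QEMU.

```python
import json


def device_arg(drive: str, share_rw: bool, model: str = "virtio-blk") -> str:
    """Build the value for -device, optionally with share-rw=on appended."""
    arg = f"{model},drive={drive}"
    if share_rw:
        arg += ",share-rw=on"
    return arg


def nbd_export_commands(device: str, host: str = "localhost",
                        port: str = "10809", writable: bool = True) -> list:
    """Return the QMP commands from the example above, in order."""
    return [
        {"execute": "qmp_capabilities"},
        {"execute": "nbd-server-start",
         "arguments": {"addr": {"type": "inet",
                                "data": {"host": host, "port": port}}}},
        {"execute": "nbd-server-add",
         "arguments": {"device": device, "writable": writable}},
    ]


if __name__ == "__main__":
    print("-device", device_arg("image", share_rw=True))
    for command in nbd_export_commands("image"):
        # One JSON object per line, as typed at the -qmp stdio prompt.
        print(json.dumps(command))
```

Whether a given libvirt version can be made to emit share-rw=on is a separate question that this sketch does not answer.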