[Qemu-discuss] A problem of VM with nbd network disk.

Caizhifeng Tue, 18 Feb 2014 23:37:24 -0800

Hello,

I've been trying to use QEMU-1.5.0 and Libvirt-1.1.0 to run a VM with an NBD 
network disk. Libvirt creates the VM as follows (note the 
-drive file=nbd:... option):


root@xxx-11:~# ps -ef | grep kvm
root      1647     2  0 Feb05 ?        00:00:00 [kvm-irqfd-clean]
root      8304     1 37 14:48 ?        00:00:03 /usr/bin/kvm -name wfg-vm -S 
-machine pc-i440fx-1.5,accel=kvm,usb=off,system=windows -m 1024 -realtime 
mlock=0 -smp 1,maxcpus=4,sockets=4,cores=1,threads=1 -uuid 
5c99f1f4-6d50-4e4e-b2f6-000750a5a154 -no-user-config -nodefaults -chardev 
socket,id=charmonitor,path=/var/lib/libvirt/qemu/wfg-vm.monitor,server,nowait 
-mon chardev=charmonitor,id=monitor,mode=control -rtc 
base=localtime,clock=vm,driftfix=slew -no-hpet -no-shutdown -device 
piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device 
usb-ehci,id=ehci,bus=pci.0,addr=0x4 -device 
virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive 
file=/vms/images/wfg-vm,if=none,id=drive-ide0-0-0,format=qcow2,cache=directsync 
-device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive 
file=nbd:192.168.0.188:20002,if=none,id=drive-ide0-0-1,readonly=on -device 
ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1,bootindex=1 -drive 
if=none,id=drive-ide0-1-0,readonly=on,format=raw -device 
ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev 
tap,fd=21,id=hostnet0 -device 
rtl8139,netdev=hostnet0,id=net0,mac=0c:da:41:1d:98:62,bus=pci.0,addr=0x3 
-chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 
-chardev 
socket,id=charchannel0,path=/var/lib/libvirt/qemu/wfg-vm.agent,server,nowait 
-device 
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0
 -device usb-tablet,id=input0,bus=usb.0 -vnc 0.0.0.0:0 -device 
VGA,id=video0,bus=pci.0,addr=0x2 -device 
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6


If the NBD server is running correctly, the VM starts successfully and 
everything seems to be OK.
But if the server crashes or the network becomes unreachable, a problem 
appears: the VM's process keeps writing to the log file 
/var/log/libvirt/qemu/wfg-vm.log and eventually fills the disk!

The log is as follows:
2014-02-18 03:25:04.702+0000: starting up
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin 
QEMU_AUDIO_DRV=none /usr/bin/kvm -name wfg-vm -S -machine 
pc-i440fx-1.5,accel=kvm,usb=off,system=windows -m 1024 -realtime mlock=0 -smp 
1,maxcpus=4,sockets=4,cores=1,threads=1 -uuid 
5c99f1f4-6d50-4e4e-b2f6-000750a5a154 -no-user-config -nodefaults -chardev 
socket,id=charmonitor,path=/var/lib/libvirt/qemu/wfg-vm.monitor,server,nowait 
-mon chardev=charmonitor,id=monitor,mode=control -rtc 
base=localtime,clock=vm,driftfix=slew -no-hpet -no-shutdown -device 
piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device 
usb-ehci,id=ehci,bus=pci.0,addr=0x4 -device 
virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive 
file=/vms/images/wfg-vm,if=none,id=drive-ide0-0-0,format=qcow2,cache=directsync 
-device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 
-drive file=nbd:192.168.0.190:10809,if=none,id=drive-ide0-0-1,readonly=on 
-device ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1,bootindex=2 
-drive if=none,id=drive-ide0-1-0,readonly=on,format=raw -device 
ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev 
tap,fd=23,id=hostnet0 -device 
rtl8139,netdev=hostnet0,id=net0,mac=0c:da:41:1d:98:62,bus=pci.0,addr=0x3 
-chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 
-chardev 
socket,id=charchannel0,path=/var/lib/libvirt/qemu/wfg-vm.agent,server,nowait 
-device 
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0
 -device usb-tablet,id=input0,bus=usb.0 -vnc 0.0.0.0:0 -device 
VGA,id=video0,bus=pci.0,addr=0x2 -device 
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
Domain id=4 is tainted: high-privileges
char device redirected to /dev/pts/6 (label charserial0)
nbd.c:nbd_receive_reply():L746: read failed, ret(0) diff from size(16)
nbd.c:nbd_receive_reply():L746: read failed, ret(0) diff from size(16)
nbd.c:nbd_receive_reply():L746: read failed, ret(0) diff from size(16)
nbd.c:nbd_receive_reply():L746: read failed, ret(0) diff from size(16)
nbd.c:nbd_receive_reply():L746: read failed, ret(0) diff from size(16)
.......


I have reviewed the function nbd_receive_reply and found that qemu_recv 
returns -1 with errno 104 (ECONNRESET on Linux) when the NBD server goes 
down. However, its caller nbd_reply_ready does not seem to handle this case: 
it returns nothing, so the upper layers are never made aware of the error, 
requests keep coming, and the logging continues.
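To illustrate the distinction I mean, here is a minimal sketch of a header 
read that separates "no data yet" from "connection is dead". It is not QEMU 
code: read_reply_header is a hypothetical helper, and mapping a read of 0 
bytes to -ECONNRESET is my own assumption. For brevity, a partial header 
followed by EAGAIN is treated as an error here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not a QEMU function: read exactly `size` bytes.
 * Returns 0 on success, -EAGAIN if no data is available yet (retry later),
 * and another negative errno value if the connection is unusable. */
static int read_reply_header(int fd, void *buf, size_t size)
{
    size_t done = 0;

    assert(buf != NULL);
    while (done < size) {
        ssize_t n = recv(fd, (char *)buf + done, size - done, MSG_DONTWAIT);
        if (n > 0) {
            done += (size_t)n;
        } else if (n == 0) {
            /* Peer closed the connection: the "ret(0)" case in the log. */
            return -ECONNRESET;
        } else if (errno == EINTR) {
            continue;
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            /* Transient only if nothing has been consumed yet. */
            return done == 0 ? -EAGAIN : -EIO;
        } else {
            return -errno;   /* e.g. -ECONNRESET (104 on Linux) */
        }
    }
    return 0;
}
```

With this shape, the caller only has to retry on -EAGAIN; any other 
negative value means the socket should not be read again.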

Here is a piece of code:

static void nbd_reply_ready(void *opaque)
{
    BDRVNBDState *s = opaque;
    uint64_t i;
    int ret;

    if (s->reply.handle == 0) {
        /* No reply already in flight.  Fetch a header.  It is possible
         * that another thread has done the same thing in parallel, so
         * the socket is not readable anymore.
         */
        ret = nbd_receive_reply(s->sock, &s->reply);
        if (ret == -EAGAIN) {
            return;
        }
        if (ret < 0) {
            /* In case the nbd server is down, the value of ret is -104.
             * I think in this case there should be a way to deliver the
             * error to the upper level, and s->sock should be closed,
             * for example by calling nbd_teardown_connection.
             */
            s->reply.handle = 0;
            goto fail;
        }
    }
    /* ... rest of the function omitted ... */
}
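The behaviour I would expect could look roughly like the sketch below. It 
is only an illustration under my own assumptions, not a patch: Conn, 
conn_teardown, conn_reply_ready and conn_submit are hypothetical names and 
do not exist in QEMU. On a hard error the socket is closed once, every 
waiting request is failed with that error, and later requests fail 
immediately instead of reading from the dead socket again.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_PENDING 16   /* hypothetical limit for this sketch */

/* Hypothetical, simplified stand-in for the driver state. */
typedef struct Conn {
    int sock;                      /* -1 once torn down */
    bool in_flight[MAX_PENDING];   /* slots waiting for a reply */
    int result[MAX_PENDING];       /* 0 or negative errno per slot */
} Conn;

/* Close the socket once and fail every waiting request with `err`. */
static void conn_teardown(Conn *c, int err)
{
    assert(c != NULL && err < 0);
    if (c->sock < 0) {
        return;                    /* already torn down, log nothing */
    }
    fprintf(stderr, "connection lost (%d), failing pending requests\n", err);
    close(c->sock);
    c->sock = -1;
    for (int i = 0; i < MAX_PENDING; i++) {
        if (c->in_flight[i]) {
            c->in_flight[i] = false;
            c->result[i] = err;
        }
    }
}

/* Reply handler: `ret` is what the header read returned. */
static void conn_reply_ready(Conn *c, int ret)
{
    if (ret == -EAGAIN) {
        return;                    /* nothing to read yet */
    }
    if (ret < 0) {
        conn_teardown(c, ret);     /* hard error: stop using the socket */
        return;
    }
    /* ret == 0: a real driver would dispatch the reply here. */
}

/* New requests fail fast once the connection is gone. */
static int conn_submit(Conn *c, int slot)
{
    assert(c != NULL && slot >= 0 && slot < MAX_PENDING);
    if (c->sock < 0) {
        return -EIO;
    }
    c->in_flight[slot] = true;
    c->result[slot] = 0;
    return 0;
}
```

Because conn_teardown returns early when the socket is already closed, the 
error is logged once rather than on every request.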


Thank you in advance.

caizhifeng...@163.com


