> -----Original Message-----
> From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> Sent: Monday, July 14, 2014 4:43 PM
> To: Wangkai (Kevin,C)
> Cc: qemu-devel@nongnu.org; aligu...@amazon.com; Lee yang
> Subject: Re: [PATCH] Tap: fix vcpu long time io blocking on tap
>
> On Mon, Jul 14, 2014 at 01:55:05AM +0000, Wangkai (Kevin,C) wrote:
> > > -----Original Message-----
> > > From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> > > Sent: Friday, July 11, 2014 9:04 PM
> > > To: Wangkai (Kevin,C)
> > > Cc: qemu-devel@nongnu.org; aligu...@amazon.com; Lee yang
> > > Subject: Re: [PATCH] Tap: fix vcpu long time io blocking on tap
> > >
> > > On Fri, Jul 11, 2014 at 01:05:30AM +0000, Wangkai (Kevin,C) wrote:
> > > > When a tap device is used as the net driver for a VM, if too many
> > > > packets are delivered to the guest OS via the tap interface, the
> > > > guest OS will be blocked on IO events for a long time while the
> > > > tap driver is busy processing packets.
> > > >
> > > > kvm vcpu thread blocked on the io lock, call trace:
> > > > __lll_lock_wait
> > > > _L_lock_1004
> > > > __pthread_mutex_lock
> > > > qemu_mutex_lock
> > > > kvm_cpu_exec
> > > > qemu_kvm_cpu_thread_fn
> > > > start_thread
> > > >
> > > > qemu io thread call trace:
> > > > ...
> > > > qemu_net_queue_send
> > > > tap_send
> > > > qemu_iohandler_poll
> > > > main_loop_wait
> > > > main_loop
> > > >
> > > > I think the QEMU io lock time should be as small as possible, and
> > > > the io work slice should be limited to a particular amount of work
> > > > or time.
> > > >
> > > > ---
> > > > Signed-off-by: Wangkai <wangka...@huawei.com>
> > >
> > > How many packets are you seeing in a single tap_send() call?
> > >
> > > Have you profiled the tap_send() code path? Maybe it is performing
> > > some operation that is very slow.
> > >
> > > By the way, if you want good performance you should use vhost_net
> > > instead of userspace virtio-net. Userspace virtio-net is not very
> > > optimized.
> > >
> > > Stefan
> >
> > Hi Stefan,
> >
> > I did not use a profiler, just debugged with gdb and code review.
>
> It's worth understanding the root cause for this behavior because
> something is probably wrong.
>
> > When packets were delivered, I found the VM was hung. I checked the
> > QEMU run state with gdb and saw the call traces for the IO thread and
> > the vcpu thread, and I added debug info to count how many packets are
> > handled within tap_send(); the info is below:
> >
> > total recv 393520 time 1539821 us
> > total recv 1270 time 4931 us
> > total recv 257872 time 995828 us
> > total recv 10745 time 41438 us
> > total recv 505387 time 2000925 us
>
> 505387 packets or 505387 bytes?
>
> If that's packets, then even with small 64-byte packets that would mean
> 32 MB of pending data!
>
> Are you running a networking benchmark where you'd expect lots of
> packets?
>
> Have you checked how long the time between tap_send() calls is? Perhaps
> something else in QEMU is blocking the event loop so packets accumulate
> in the host kernel.
>
> Stefan
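As an aside, the limit proposed in the quoted patch description (bound the
IO work slice of a tap_send() pass) could look roughly like the standalone
sketch below. Everything in it is illustrative: the struct, deliver_to_guest()
and the budget of 64 packets are made-up placeholders, not QEMU code and not
the actual patch.

/* Sketch only: bound the work done per tap receive pass so the IO
 * thread releases the big lock regularly and vcpus are not starved.
 * This is a standalone illustration; names and the budget value are
 * made up. */
#include <stdint.h>
#include <unistd.h>

#define TAP_SEND_BUDGET 64            /* max packets per pass (arbitrary) */

struct tap_state {
    int fd;                           /* tap file descriptor */
    uint8_t buf[68 * 1024];           /* one packet buffer */
};

/* Placeholder for handing the packet to the guest's virtio-net queue. */
void deliver_to_guest(const uint8_t *buf, ssize_t len);

/* Called by the event loop when the tap fd becomes readable. */
void tap_receive_pass(struct tap_state *s)
{
    int budget = TAP_SEND_BUDGET;

    while (budget-- > 0) {
        ssize_t len = read(s->fd, s->buf, sizeof(s->buf));
        if (len <= 0) {
            break;                    /* kernel queue drained (or error) */
        }
        deliver_to_guest(s->buf, len);
    }
    /* If packets remain, the fd stays readable and the event loop will
     * invoke this handler again on a later iteration. */
}

The design point is simply that the handler returns to the event loop
quickly while the tap fd stays readable, so leftover packets are handled on
later iterations instead of in one long pass under the global mutex.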
Hi Stefan,

Here is the detailed network setup:

+--------------------------------------------+
| The host add tap1 and eth10 to bridge 'br1'|                +--------+
|                +------------+              |                | send   |
| VM eth1-+-tap1 --- bridge --- eth10 --+---------------------+ packets|
|                +------------+              |                |        |
+--------------------------------------------+                +--------+

QEMU starts the VM with virtio using a tap interface; the options are:

  -net nic,vlan=101,model=virtio -net tap,vlan=101,ifname=tap1,script=no,downscript=no

And tap1 and eth10 are added to bridge br1 on the host:

  brctl addif br1 tap1
  brctl addif br1 eth10

"total recv 505387 time 2000925 us" means one tap_send() call handled
505387 packets; the packet payload was 300 bytes, and the time spent in
tap_send() was 2,000,925 microseconds. The time was measured by recording
a timestamp at the start and at the end of tap_send().

We were just testing the network performance of the VM.

Regards
Wangkai
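For reference, the measurement described above can be reproduced roughly as
in the sketch below: a gettimeofday() timestamp pair around the receive pass
plus a packet counter, printed as one summary line. The function names and
the output format are illustrative placeholders, not the actual debug patch.

/* Sketch of the debug instrumentation described above: record a
 * timestamp when the tap receive pass starts and when it ends, count
 * the packets handled in between, and print one summary line.
 * Names and the exact output format are illustrative only. */
#include <stdio.h>
#include <sys/time.h>

/* Placeholder for the body of the receive loop; returns the number of
 * packets it handled in this pass. */
long tap_receive_pass_counted(void);

void tap_receive_pass_timed(void)
{
    struct timeval start, end;

    gettimeofday(&start, NULL);
    long packets = tap_receive_pass_counted();
    gettimeofday(&end, NULL);

    long usecs = (end.tv_sec - start.tv_sec) * 1000000L +
                 (end.tv_usec - start.tv_usec);
    fprintf(stderr, "total recv %ld time %ld us\n", packets, usecs);
}

With a 300-byte payload, 505387 packets in one pass is roughly 150 MB of
data handled inside a single event-loop callback, which is consistent with
the ~2 second blocking seen by the vcpu threads.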