Near the top of hw/virtio.c we have this: /* QEMU doesn't strictly need write barriers since everything runs in * lock-step. We'll leave the calls to wmb() in though to make it obvious for * KVM or if kqemu gets SMP support. * In any case, we must prevent the compiler from reordering the code. * TODO: we likely need some rmb()/mb() as well. */
#define wmb() __asm__ __volatile__("": : :"memory") However, as far as I can tell when using both kvm and io-thread, the assertion that barriers aren't necessary just isn't true. Although it probably works on x86 with its strongly ordered model. We think we've hit a race due to this with kvm on POWER, described in the forwarded message below. Adding a barrier fixes the problem. Below we just stuck in a powerpc specific barrier which was useful as proof of concept, but I'm not sure how to go about putting a proper, architecture-appropriate barrier into this code. ----- Forwarded message from Alexey Kardashevskiy <a...@ozlabs.ru> ----- Date: Thu, 25 Aug 2011 16:45:31 +1000 From: Alexey Kardashevskiy <a...@ozlabs.ru> To: David Gibson <d...@au1.ibm.com> Subject: memory barrier problem Hi! Recently I got a bugreport about virtio on power-kvm. The ping command causes fatal virtio-ring error: virtio_net virtio0: output:id 138 is not a head! Below is how I reproduce the issue: 1. configure QEMU: ./configure --target-list=ppc64-softmmu --enable-fdt --enable-io-thread \ --enable-attr --enable-iommu --enable-kvm '--cc=gcc -m64' --enable-debug 2. compile it and run: qemu-impreza/ppc64-softmmu/qemu-system-ppc64 \ -enable-kvm -M pseries -m 1024 -nographic \ -mem-path /var/lib/hugetlbfs/global/pagesize-16MB/ \ -net nic,model=virtio-net \ -net user,hostfwd=tcp::6555-:22 \ -kernel /home/aik/guest.vmlinux.n \ -append "console=hvc0 debug" \ -initrd /home/aik/fs_large.aik.cpio 3. in the guest, I configure the single interface with "dhclient". The "eth0" receives an IP from 10.0.2.2 (QEMUs DHCP server address) 4. in the guest, I run ping as below (i.e. I ping QEMU): ping -f -s 65507 10.0.2.2 After a while (5 to 300 seconds), the message from drivers/virtio/virtio_ring.c:326 appears. This one: BAD_RING(vq, "id %u is not a head!\n", i); The analysis shows that it happens when the guest driver is trying to get/release a buffer from the "used" list - it is a ring buffer with a free-running counter. The QEMU first adds a new element to a ring buffer and then updates the counter (this is correct) but in the guest handler it is possible sometime that the counter is updated while the ring buffer element is still old. To confirm this, I added a loop to re-read the ring buffer element until it changes to the expected value. David suggested to have a look at wmb() implementation on the QEMU side, and it is: #define wmb() __asm__ __volatile__("": : :"memory") When I replace it with the line below, everything works just fine: #define wmb() __asm__ __volatile__("eieio": : :"memory") Host/guest kernels and QEMU trees versions do not matter much as virtio rings have not been changed this year I believe. Hardware is a POWER7 machine. The question is if it is a correct fix or we need something better, don't we? -- Alexey ----- End forwarded message ----- -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson