[Qemu-devel] [Bug 1823458] Re: race condition between vhost_net_stop and CHR_EVENT_CLOSED on shutdown crashes qemu

Dan Streetman Fri, 10 May 2019 07:47:15 -0700

On a Xenial DPDK setup with the proposed qemu version (1:2.5+dfsg-
5ubuntu10.37), I created a VM and attached a vhost-user interface to it
using this xml:


$ cat vm3-iface2.xml 
  <interface type='vhostuser'>
    <mac address='52:54:00:c3:37:7e'/>
    <source type='unix' path='/run/openvswitch/vhu2' mode='client'/>
    <model type='virtio'/>
    <driver name='vhost'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
  </interface>

the OVS interface was created with:
# ovs-vsctl add-port br1 vhu2 -- set Interface vhu2 type=dpdkvhostuser

The interface was added to the vm with:
$ virsh attach-device vm3 vm3-iface2.xml --live

and detached with:
$ virsh detach-device vm3 vm3-iface2.xml --live

Inside the guest, the vhost-user interface was configured with DHCP, and
a ping started to the DHCP server, 10.0.2.2:

ubuntu@vm3:~$ ping 10.0.2.2
PING 10.0.2.2 (10.0.2.2) 56(84) bytes of data.
64 bytes from 10.0.2.2: icmp_seq=1 ttl=255 time=0.122 ms
64 bytes from 10.0.2.2: icmp_seq=2 ttl=255 time=0.107 ms
64 bytes from 10.0.2.2: icmp_seq=3 ttl=255 time=0.112 ms
64 bytes from 10.0.2.2: icmp_seq=4 ttl=255 time=0.110 ms
>From 10.198.200.1 icmp_seq=5 Destination Port Unreachable
>From 10.198.200.1 icmp_seq=6 Destination Port Unreachable
>From 10.198.200.1 icmp_seq=7 Destination Port Unreachable
>From 10.198.200.1 icmp_seq=8 Destination Port Unreachable
64 bytes from 10.0.2.2: icmp_seq=9 ttl=255 time=0.255 ms
>From 10.198.200.1 icmp_seq=10 Destination Port Unreachable
>From 10.198.200.1 icmp_seq=11 Destination Port Unreachable
>From 10.198.200.1 icmp_seq=12 Destination Port Unreachable
>From 10.198.200.1 icmp_seq=13 Destination Port Unreachable
>From 10.198.200.1 icmp_seq=14 Destination Port Unreachable
>From 10.198.200.1 icmp_seq=15 Destination Port Unreachable
64 bytes from 10.0.2.2: icmp_seq=16 ttl=255 time=0.277 ms
64 bytes from 10.0.2.2: icmp_seq=17 ttl=255 time=0.127 ms
64 bytes from 10.0.2.2: icmp_seq=18 ttl=255 time=0.104 ms


each change in ping (working, not working) corresponded to attaching or 
detaching the vhost-user interface; repeated attaches/detaches were made with 
no problems, and ping correctly working or not working based on if the 
interface was attached or not.

The guest was suspended, I waited for 5 seconds, and then resumed the
guest, with no problems.

The guest was shutdown with the interface attached with no problem or
crash; the guest was also shutdown with the interface detached (after
attaching and detaching several times) with no problem.

All this was repeated several times with no problems seen.

I believe this covers regression testing for the area of code the patch
touches, so marking this as fix committed again; this should be ready
for release.

** Changed in: qemu (Ubuntu Xenial)
       Status: Incomplete => Fix Committed

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1823458

Title:
  race condition between vhost_net_stop and CHR_EVENT_CLOSED on shutdown
  crashes qemu

Status in Ubuntu Cloud Archive:
  Fix Released
Status in Ubuntu Cloud Archive mitaka series:
  Fix Committed
Status in Ubuntu Cloud Archive ocata series:
  Fix Committed
Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Trusty:
  Won't Fix
Status in qemu source package in Xenial:
  Fix Committed
Status in qemu source package in Bionic:
  Fix Released
Status in qemu source package in Cosmic:
  Fix Released
Status in qemu source package in Disco:
  Fix Released

Bug description:
  [impact]

  on shutdown of a guest, there is a race condition that results in qemu
  crashing instead of normally shutting down.  The bt looks similar to
  this (depending on the specific version of qemu, of course; this is
  taken from 2.5 version of qemu):

  (gdb) bt
  #0  __GI___pthread_mutex_lock (mutex=0x0) at ../nptl/pthread_mutex_lock.c:66
  #1  0x00005636c0bc4389 in qemu_mutex_lock (mutex=mutex@entry=0x0) at 
/build/qemu-7I4i1R/qemu-2.5+dfsg/util/qemu-thread-posix.c:73
  #2  0x00005636c0988130 in qemu_chr_fe_write_all (s=s@entry=0x0, 
buf=buf@entry=0x7ffe65c086a0 "\v", len=len@entry=20) at 
/build/qemu-7I4i1R/qemu-2.5+dfsg/qemu-char.c:205
  #3  0x00005636c08f3483 in vhost_user_write (msg=msg@entry=0x7ffe65c086a0, 
fds=fds@entry=0x0, fd_num=fd_num@entry=0, dev=0x5636c1bf6b70, 
dev=0x5636c1bf6b70)
      at /build/qemu-7I4i1R/qemu-2.5+dfsg/hw/virtio/vhost-user.c:195
  #4  0x00005636c08f411c in vhost_user_get_vring_base (dev=0x5636c1bf6b70, 
ring=0x7ffe65c087e0) at 
/build/qemu-7I4i1R/qemu-2.5+dfsg/hw/virtio/vhost-user.c:364
  #5  0x00005636c08efff0 in vhost_virtqueue_stop (dev=dev@entry=0x5636c1bf6b70, 
vdev=vdev@entry=0x5636c2853338, vq=0x5636c1bf6d00, idx=1) at 
/build/qemu-7I4i1R/qemu-2.5+dfsg/hw/virtio/vhost.c:895
  #6  0x00005636c08f2944 in vhost_dev_stop (hdev=hdev@entry=0x5636c1bf6b70, 
vdev=vdev@entry=0x5636c2853338) at 
/build/qemu-7I4i1R/qemu-2.5+dfsg/hw/virtio/vhost.c:1262
  #7  0x00005636c08db2a8 in vhost_net_stop_one (net=0x5636c1bf6b70, 
dev=dev@entry=0x5636c2853338) at 
/build/qemu-7I4i1R/qemu-2.5+dfsg/hw/net/vhost_net.c:293
  #8  0x00005636c08dbe5b in vhost_net_stop (dev=dev@entry=0x5636c2853338, 
ncs=0x5636c209d110, total_queues=total_queues@entry=1) at 
/build/qemu-7I4i1R/qemu-2.5+dfsg/hw/net/vhost_net.c:371
  #9  0x00005636c08d7745 in virtio_net_vhost_status (status=7 '\a', 
n=0x5636c2853338) at /build/qemu-7I4i1R/qemu-2.5+dfsg/hw/net/virtio-net.c:150
  #10 virtio_net_set_status (vdev=<optimized out>, status=<optimized out>) at 
/build/qemu-7I4i1R/qemu-2.5+dfsg/hw/net/virtio-net.c:162
  #11 0x00005636c08ec42c in virtio_set_status (vdev=0x5636c2853338, 
val=<optimized out>) at /build/qemu-7I4i1R/qemu-2.5+dfsg/hw/virtio/virtio.c:624
  #12 0x00005636c098fed2 in vm_state_notify (running=running@entry=0, 
state=state@entry=RUN_STATE_SHUTDOWN) at 
/build/qemu-7I4i1R/qemu-2.5+dfsg/vl.c:1605
  #13 0x00005636c089172a in do_vm_stop (state=RUN_STATE_SHUTDOWN) at 
/build/qemu-7I4i1R/qemu-2.5+dfsg/cpus.c:724
  #14 vm_stop (state=RUN_STATE_SHUTDOWN) at 
/build/qemu-7I4i1R/qemu-2.5+dfsg/cpus.c:1407
  #15 0x00005636c085d240 in main_loop_should_exit () at 
/build/qemu-7I4i1R/qemu-2.5+dfsg/vl.c:1883
  #16 main_loop () at /build/qemu-7I4i1R/qemu-2.5+dfsg/vl.c:1931
  #17 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) 
at /build/qemu-7I4i1R/qemu-2.5+dfsg/vl.c:4683

  [test case]

  unfortunately since this is a race condition, it's very hard to
  arbitrarily reproduce; it depends very much on the overall
  configuration of the guest as well as how exactly it's shut down -
  specifically, its vhost user net must be closed from the host side at
  a specific time during qemu shutdown.

  I have someone with such a setup who has reported to me their setup is
  able to reproduce this reliably, but the config is too complex for me
  to reproduce so I have relied on their reproduction and testing to
  debug and craft the patch for this.

  [regression potential]

  the change adds a flag to prevent repeated calls to vhost_net_stop().
  This also prevents any calls to vhost_net_cleanup() from
  net_vhost_user_event().  Any regression would be seen when stopping
  and/or cleaning up a vhost net.  Regressions might include failure to
  hot-remove a vhost net from a guest, or failure to cleanup (i.e. mem
  leak), or crashes during cleanup or stopping a vhost net.

  [other info]

  this was originally seen in the 2.5 version of qemu - specifically,
  the UCA version in trusty-mitaka (which uses the xenial qemu
  codebase).

  After discussion upstream, it appears this was fixed upstream by
  commit e7c83a885f8, which is included starting in version 2.9.
  However, this commit depends on at least commit 5345fdb4467, and
  likely more other previous commits, which make widespread code changes
  and are unsuitable to backport.  Therefore this seems like it should
  be specifically worked around in the Xenial qemu codebase.

  
  The specific race condition for this (in the qemu 2.5 code version) is:

  as shown in above bt, thread A starts shutting down qemu, e.g.:

  vm_stop->do_vm_stop->vm_state_notify
    virtio_set_status
      virtio_net_set_status
        virtio_net_vhost_status

  in this function, code gets to an if-else check for
  (!n->vhost_started), which is false (i.e. vhost_started is true) and
  enters the else code block, which calls vhost_net_stop() and then sets
  n->vhost_started to false.

  While thread A is inside vhost_net_stop(), thread B is triggered by
  the vhost net chr handler with a user event and calls:

  net_vhost_user_event
    qmp_set_link (from case CHR_EVENT_CLOSED)
      virtio_net_set_link_status (via ->link_status_changed)
        virtio_net_set_status
          virtio_net_vhost_status

  notice thread B has now reached the same function that thread A is in;
  since the checks in the function have not changed, thread B follows
  the same path that thread A followed, and enters vhost_net_stop().

  Since thread A has already shut down and cleaned up some of the
  internals, once thread B starts trying to also clean up things, it
  segfaults as the shown in the bt.

  Avoiding only this duplicate call to vhost_net_stop() is required, but
  not enough - let's continue to look at what thread B does after its
  call to qmp_set_link() returns:

  net_vhost_user_event
    vhost_user_stop
      vhost_net_cleanup
        vhost_dev_cleanup

  However, in main() qemu registers atexit(net_cleanup()), which does:
  net_cleanup
    qemu_del_nic (or qemu_del_net_client, depending on ->type)
      qemu_cleanup_net_client
        vhost_user_cleanup (via ->cleanup)
          vhost_net_cleanup
            vhost_dev_cleanup

  and the duplicate vhost_dev_cleanup fails assertions since things were
  already cleaned up.  Additionally, if thread B's call to
  vhost_dev_cleanup() comes before thread A finishes vhost_net_stop(),
  then that will call vhost_dev_stop() and vhost_disable_notifiers()
  which both try to access things that have been freed/cleared/disabled
  by vhost_dev_cleanup().

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1823458/+subscriptions

[Qemu-devel] [Bug 1823458] Re: race condition between vhost_net_stop and CHR_EVENT_CLOSED on shutdown crashes qemu

Reply via email to