On Wed, Apr 13, 2016 at 7:32 PM, Yuanhan Liu <yuanhan....@linux.intel.com> wrote:
>>
>> > I'm asking because I found a seg fault issue sometimes,
>> > due to opaque is NULL.
>
> Oh, I was wrong, it's u being NULL, but not opaque.
>> >
>>
>> I would be interested to see the backtrace or have a reproducer.
>
> It's a normal test steps: start a vhost-user switch (I'm using DPDK
> vhost-switch example), kill it, and wait for a while (something like
> more than 10s or even longer), then I saw a seg fault:
>
>     (gdb) p dev
>     $4 = (struct vhost_dev *) 0x555556571bf0
>     (gdb) p u
>     $5 = (struct vhost_user *) 0x0
>     (gdb) where
>     #0  0x0000555555798612 in slave_read (opaque=0x555556571bf0)
>         at /home/yliu/qemu/hw/virtio/vhost-user.c:539
>     #1  0x0000555555a343a4 in aio_dispatch (ctx=0x55555655f560) at
>         /home/yliu/qemu/aio-posix.c:327
>     #2  0x0000555555a2738b in aio_ctx_dispatch (source=0x55555655f560,
>         callback=0x0, user_data=0x0)
>         at /home/yliu/qemu/async.c:233
>     #3  0x00007ffff51032a6 in g_main_context_dispatch () from
>         /lib64/libglib-2.0.so.0
>     #4  0x0000555555a3239e in glib_pollfds_poll () at
>         /home/yliu/qemu/main-loop.c:213
>     #5  0x0000555555a3247b in os_host_main_loop_wait (timeout=29875848) at
>         /home/yliu/qemu/main-loop.c:258
>     #6  0x0000555555a3252b in main_loop_wait (nonblocking=0) at
>         /home/yliu/qemu/main-loop.c:506
>     #7  0x0000555555846e35 in main_loop () at /home/yliu/qemu/vl.c:1934
>     #8  0x000055555584e6bf in main (argc=31, argv=0x7fffffffe078,
>         envp=0x7fffffffe178)
>         at /home/yliu/qemu/vl.c:4658
>
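
To illustrate the pattern in the backtrace: the handler is called with a
valid dev pointer, but the backend state it reaches through dev is already
NULL by the time the fd event fires. Below is a minimal, self-contained
sketch of that situation with a guard in the handler. All of the demo_*
names and fields are hypothetical stand-ins, not the actual QEMU
structures, and it assumes the backend pointer is cleared on teardown.
A guard like this would only sidestep this one dereference.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real structures; names and fields are
 * illustrative only and are not taken from the QEMU source. */
struct demo_backend {
    int slave_fd;
};

struct demo_dev {
    struct demo_backend *backend;   /* assumed to be cleared on teardown */
};

/* Teardown path: the device forgets its backend state. */
void demo_cleanup(struct demo_dev *dev)
{
    dev->backend = NULL;
}

/* fd handler: returns true if the event could be processed, false if it
 * was ignored because the backend state is already gone. */
bool demo_slave_read(void *opaque)
{
    struct demo_dev *dev = opaque;
    struct demo_backend *u = dev ? dev->backend : NULL;

    if (u == NULL) {
        /* Stale event delivered after teardown: nothing to read from. */
        return false;
    }
    return u->slave_fd >= 0;
}
```

Calling demo_slave_read() after demo_cleanup() returns false instead of
dereferencing a NULL pointer.
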
This patch set doesn't try to handle backend crashes. That would require
a much more detailed study of the existing code paths, since a lot of places
assume the backend is fully working as expected. I think handling backend
crashes should be a different, later patch set.

-- 
Marc-André Lureau