On Sat, Mar 08, 2014 at 04:00:43PM +0100, Stefan Hajnoczi wrote:
> The net subsystem has a control flow mechanism so peer NetClientStates
> can tell each other to stop sending packets. This is used to stop
> monitoring the tap file descriptor for incoming packets if the guest rx
> ring has no spare buffers.
>
> There is a corner case when tap_can_send() is true at the beginning of
> an event loop iteration but becomes false before the tap_send() fd
> handler is invoked.
>
> tap_send() will read the packet from the tap file descriptor and attempt
> to send it. The net queue will hold on to the packet and return 0,
> indicating that further I/O is not possible. tap then stops monitoring
> the file descriptor for reads.
>
> This is unlike the normal case where tap_can_send() is the same before
> and during the event loop iteration. The event loop would simply not
> monitor the file descriptor if tap_can_send() returns false. Upon next
> iteration it would check tap_can_send() again and begin monitoring if we
> can send.
>
> The deadlock happens because tap_send() explicitly disabled read_poll.
> This is done with the expectation that the peer will call
> qemu_net_queue_flush(). But hw/net/virtio-net.c does not monitor
> vm_running transitions and issue the flush. Hence we're left with a
> broken tap device.
>
> Cc: qemu-sta...@nongnu.org
> Reported-by: Neil Skrypuch <n...@tembosocial.com>
> Signed-off-by: Stefan Hajnoczi <stefa...@redhat.com>
> ---
> net/tap.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
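
The corner case in the quoted description can be sketched as a small toy
model in C. This is not QEMU code and it does not reproduce the patch:
the struct and every toy_* name are hypothetical, and the model only
assumes what the description states (the fd handler disables read
polling when the queue holds a packet, and only a flush from the peer
re-enables it).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all names below are hypothetical, not QEMU's. */
struct toy_tap {
    bool read_poll;  /* explicit "monitor the fd for reads" flag */
    bool can_send;   /* stand-in for what tap_can_send() would report */
    int queued;      /* packets held by the net queue */
    int delivered;   /* packets that reached the peer */
};

/* Start of an event loop iteration: is the fd monitored? */
static bool toy_fd_monitored(const struct toy_tap *t)
{
    assert(t != NULL);
    return t->read_poll && t->can_send;
}

/* fd handler: read one packet and try to send it to the peer. */
static void toy_fd_handler(struct toy_tap *t)
{
    assert(t != NULL);
    if (t->can_send) {
        t->delivered++;
        return;
    }
    /* The queue holds the packet ("returns 0"), so polling is disabled
     * in the expectation that the peer will flush later. */
    t->queued++;
    t->read_poll = false;
}

/* Peer flush: deliver the queued packets and re-enable read polling. */
static void toy_flush(struct toy_tap *t)
{
    assert(t != NULL);
    t->delivered += t->queued;
    t->queued = 0;
    t->read_poll = true;
}

/* Replays the corner case. Returns true if the fd is monitored again
 * once the peer is able to receive. */
static bool toy_corner_case(bool peer_flushes)
{
    struct toy_tap t = { .read_poll = true, .can_send = true };
    bool monitored = toy_fd_monitored(&t); /* true at iteration start */

    t.can_send = false;                    /* flips before the handler runs */
    if (monitored) {
        toy_fd_handler(&t);
    }

    t.can_send = true;                     /* peer can receive again */
    if (peer_flushes) {
        toy_flush(&t);
    }
    return toy_fd_monitored(&t);
}
```

In this model toy_corner_case(false) leaves the fd unmonitored even
though the peer can receive again, which is the stuck state described
above. toy_corner_case(true) recovers because the flush re-enables
read polling.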

Applied to my net tree:
https://github.com/stefanha/qemu/commits/net

Stefan