On Sat, Mar 08, 2014 at 04:00:43PM +0100, Stefan Hajnoczi wrote:
> The net subsystem has a flow control mechanism so peer NetClientStates
> can tell each other to stop sending packets.  This is used to stop
> monitoring the tap file descriptor for incoming packets if the guest rx
> ring has no spare buffers.
> 
> There is a corner case when tap_can_send() is true at the beginning of
> an event loop iteration but becomes false before the tap_send() fd
> handler is invoked.
> 
> tap_send() will read the packet from the tap file descriptor and attempt
> to send it.  The net queue will hold on to the packet and return 0,
> indicating that further I/O is not possible.  tap then stops monitoring
> the file descriptor for reads.
> 
> This is unlike the normal case where tap_can_send() is the same before
> and during the event loop iteration.  The event loop would simply not
> monitor the file descriptor if tap_can_send() returns false.  Upon next
> iteration it would check tap_can_send() again and begin monitoring if we
> can send.
> 
> The deadlock happens because tap_send() explicitly disables read_poll.
> This is done with the expectation that the peer will call
> qemu_net_queue_flush().  But hw/net/virtio-net.c does not monitor
> vm_running transitions and issue the flush.  Hence we're left with a
> broken tap device.
> 
> Cc: qemu-sta...@nongnu.org
> Reported-by: Neil Skrypuch <n...@tembosocial.com>
> Signed-off-by: Stefan Hajnoczi <stefa...@redhat.com>
> ---
>  net/tap.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
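
For anyone reading along, the corner case described in the commit message
can be modelled with a small toy sketch.  This is not QEMU code and not
necessarily what the patch does: ToyTap, toy_send_packet(),
toy_tap_send_unguarded(), toy_tap_send_guarded() and toy_iteration() are
hypothetical names, and peer_can_receive merely stands in for the result
of tap_can_send().  The "guarded" handler is one assumed way to avoid the
problem, by re-checking the peer before reading from the fd.

```c
#include <stdbool.h>

/* Toy model only; every name here is hypothetical, not a QEMU API. */
typedef struct {
    bool peer_can_receive; /* stands in for what tap_can_send() reports */
    bool read_poll;        /* whether the tap fd is monitored for reads */
    int  queued;           /* packets held by the net queue */
    int  fd_pending;       /* packets still waiting in the tap fd */
} ToyTap;

/* Try to deliver one packet.  Returns 0 when the packet had to be queued,
 * meaning the peer is expected to flush the queue later. */
static int toy_send_packet(ToyTap *t)
{
    if (!t->peer_can_receive) {
        t->queued++;
        return 0;
    }
    return 1;
}

/* Handler as in the corner case: it reads from the fd unconditionally and
 * disables read_poll when the send returns 0. */
static void toy_tap_send_unguarded(ToyTap *t)
{
    if (t->fd_pending == 0) {
        return;
    }
    t->fd_pending--;
    if (toy_send_packet(t) == 0) {
        t->read_poll = false; /* relies on a later flush to re-enable */
    }
}

/* Assumed alternative: re-check the peer before reading, so the packet
 * stays in the fd and read_poll is left untouched. */
static void toy_tap_send_guarded(ToyTap *t)
{
    if (!t->peer_can_receive || t->fd_pending == 0) {
        return;
    }
    t->fd_pending--;
    if (toy_send_packet(t) == 0) {
        t->read_poll = false;
    }
}

/* One event loop iteration.  The fd is watched only if read_poll is set
 * and the peer could receive at the start of the iteration.  'flip'
 * models the peer becoming unable to receive before the handler runs. */
static void toy_iteration(ToyTap *t, bool flip, void (*handler)(ToyTap *))
{
    bool watch = t->read_poll && t->peer_can_receive;

    if (flip) {
        t->peer_can_receive = false;
    }
    if (watch && t->fd_pending > 0) {
        handler(t);
    }
}
```

With the unguarded handler, an iteration where the peer flips leaves
read_poll cleared, and unless something flushes the queue, later
iterations never watch the fd again even once the peer is ready.  With
the guarded handler the packet stays in the fd and the next iteration
picks it up.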

Applied to my net tree:
https://github.com/stefanha/qemu/commits/net

Stefan
