On Mon, 06/29 11:39, Stefan Hajnoczi wrote: > On Fri, Jun 26, 2015 at 05:06:01PM +0800, Jason Wang wrote: > > > > > > On 06/25/2015 05:18 PM, Stefan Hajnoczi wrote: > > > e1000_can_receive() checks the link up status register bit. If the bit > > > is clear, packets will be queued and the peer may disable receive to > > > avoid wasting CPU reading packets that cannot be delivered. The queue > > > must be flushed once the link comes back up again. > > > > > > This patch fixes broken e1000 receive with Mac OS X Snow Leopard guests > > > and tap networking. Flushing the queue invokes the async send callback, > > > which re-enables tap fd read. > > > > > > Reported-by: Jonathan Liu <net...@gmail.com> > > > Signed-off-by: Stefan Hajnoczi <stefa...@redhat.com> > > > --- > > > hw/net/e1000.c | 3 +++ > > > 1 file changed, 3 insertions(+) > > > > > > diff --git a/hw/net/e1000.c b/hw/net/e1000.c > > > index bab8e2a..5c6bcd0 100644 > > > --- a/hw/net/e1000.c > > > +++ b/hw/net/e1000.c > > > @@ -185,6 +185,9 @@ e1000_link_up(E1000State *s) > > > { > > > s->mac_reg[STATUS] |= E1000_STATUS_LU; > > > s->phy_reg[PHY_STATUS] |= MII_SR_LINK_STATUS; > > > + > > > + /* E1000_STATUS_LU is tested by e1000_can_receive() */ > > > + qemu_flush_queued_packets(qemu_get_queue(s->nic)); > > > } > > > > > > static bool > > > > This only solves the issue partially and just for e1000. After checking > > all can_receive() functions, another card with similar behaviour is vmxnet3. > > No, this bug is specific to e1000 (details on that below in case 2b). > > There are more issues though, so additional patches are welcome. > > > Looking at commit a90a7425cf592a3afeff3eaf32f543b83050ee5c ("tap: Drop > > tap_can_send") again. The commit disable tap read poll when > > qemu_net_queue_send() returns zero. Which is usually the following cases: > > > > 1) queue->delivering is 1 > > 2) qemu_can_send_packet() returns zero, which is: > > 2a) vm_running is false > > or > > 2b) can_receive() return zero > > 3) qemu_net_queue_deliver() returns zero, which is: > > 3a) nc->receive_disabled is true > > or > > 3b) info->receive_raw() or nc->receive->receive() returns zero > > > > This means we should enable tap read poll when one of those conditions > > is not existed. This patch fixes 2b) only for e1000. > > > > for 1, I'm not quite sure it's a real problem or how to fix. > > This is not a problem. When the delivery code returns it flushes the > queue. This will invoke send callback functions, so tap will re-enable > read poll. > > > for 2a, we may probably need a vm state change handler to flush the > > queue, otherwise network will stall after a stop and cont. > > Yes, I'm surprised the code isn't doing this already.
It's rare that a .can_receive checks vm_running. Why is it only necessary in virtio-net, or is it that other NIC models are wrong? > > > for 2b, we can either fixes the card that check link status in its > > can_receive() or just simply can qemu_flush_queued_packets() in set_link. > > No, this doesn't work because the e1000 link status bit is *not* > equivalent to nc->link_down. The link auto-negotiation code changes the > e1000 bit independently from nc->link_down using a timer. > > The net subsystem cannot know when this e1000-specific bit is supposed > to change. Therefore there is no general fix. > > Also, qemu_send_packet() drops packets while nc->link_down is true. > We're not supposed to queue so it shouldn't be necessary to flush > packets in set_link at the net subsystem level. > > > 3a and 3b looks ok, since this happen when there's no space for rx, > > after guest refill, qemu will call qemu_flush_queued_packets() to flush > > and re-enable tap read poll. > > > > 2a and 2b does not exits before this commit, since tap_send check > > qemu_can_send_packet() before. > > The delivery and queuing mechanism in QEMU is overcomplicated. The > semantics of .can_receive() and .receive() need to be clarified before > we can do real cleanup though: > > .receive() returning 0 is not quite the same as .can_receive() returning > false. > > When .receive() returns 0 this means the peer should wait before sending > more. In most cases the net queue only needs to hold 1 packet. But > there are cases like slirp where QEMU code generates packets and may not > be able to continue without calling qemu_send_packet() on more packets > (I'm not sure but I suspect this is the case). > > When .can_receive() returns false it also casues packets to be queued, > but NIC emulation code often includes more conditions in .can_receive() > than .receive() returning 0. > > Cleaning this up requires auditing all the NICs. Maybe we can get rid > of .can_receive() and just have a simplified .receive(): > > false - queue packet and tell peer to stop sending. Flush must be > called later to re-enable sending. > true - packet delivered or dropped Yes, I find that some of the .can_receive callbacks look OK for a simple dropping, for example net-hub and virtio-net. I can try this, or even better if we can split the work. As a starter maybe we can drop the .can_receive callback and move existing implementations to the beginning of .receive() - something that a few NICs already do. Fam