On 2019/4/10 10:33 PM, David Woodhouse wrote:
On Wed, 2019-04-10 at 15:42 +0200, Toke Høiland-Jørgensen wrote:
That doesn't seem to make much difference at all; it's still dropping a
lot of packets because ptr_ring_produce() is returning non-zero.
I think you need to try stopping the queue in just this case? Ideally we
would stop the queue when it is about to become full, but we don't
currently have such a helper.
I don't quite understand. If the ring isn't full after I've put a
packet into it... how can it be full subsequently? We can't end up in
tun_net_xmit() concurrently, right?
Note, NETIF_F_LLTX is used for tun. But this reminds me of a commit from
Michael:
5d097109257c03a71845729f8db6b5770c4bbedc ("tun: only queue packets on
device"). It looks like you want to restore the IFF_ONE_QUEUE behavior.
But that commit is kind of confusing since we did skb_orphan_frags()
anyway. Maybe Michael can say more on this.
I'm not (knowingly) using XDP.
Ideally we want to react when the queue starts building rather than when
it starts getting full; by pushing back on upper layers (or, if
forwarding, dropping packets to signal congestion).
This is precisely what my first accidental if (!ptr_ring_empty())
variant was doing, right? :)
But I gave your ptr_ring_full() patch a try on a VM, and it looks like
it works (single flow): no packets were dropped by TAP anymore. How many
flows did you use?
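
For reference, the stop-when-full idea can be modelled in plain userspace
C. This is only a sketch under stated assumptions, not the actual patch:
struct model_queue, RING_SIZE and the model_*() helpers are hypothetical
names, the "stopped" flag merely stands in for what
netif_tx_stop_queue()/netif_tx_wake_queue() would do, and there is no
locking or concurrency here at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 4 /* hypothetical; tiny on purpose */

/* Userspace stand-in for one tun queue: a fixed ring plus a stop flag. */
struct model_queue {
	int slots[RING_SIZE];
	size_t head;           /* next slot to fill */
	size_t tail;           /* next slot to drain */
	size_t count;          /* packets currently in the ring */
	bool stopped;          /* models the stopped state of the TX queue */
	unsigned long dropped; /* packets refused because the ring was full */
};

static bool model_ring_full(const struct model_queue *q)
{
	return q->count == RING_SIZE;
}

/* Producer side, in the role of tun_net_xmit(). Returns false on drop. */
static bool model_xmit(struct model_queue *q, int pkt)
{
	assert(q->count <= RING_SIZE);

	if (model_ring_full(q)) {
		/* Late stop: the packet is already lost at this point. */
		q->dropped++;
		q->stopped = true;
		return false;
	}

	q->slots[q->head] = pkt;
	q->head = (q->head + 1) % RING_SIZE;
	q->count++;

	/*
	 * Early stop: this packet took the last slot, so ask the upper
	 * layer to hold the next one instead of letting it be dropped.
	 */
	if (model_ring_full(q))
		q->stopped = true;

	return true;
}

/* Consumer side, in the role of tun_do_read(). Returns false if empty. */
static bool model_read(struct model_queue *q, int *pkt)
{
	if (q->count == 0)
		return false;

	*pkt = q->slots[q->tail];
	q->tail = (q->tail + 1) % RING_SIZE;
	q->count--;

	/* Room again: models waking the queue. */
	q->stopped = false;
	return true;
}
```

In this model a caller that respects the stopped flag never reaches the
drop branch, which matches the single-flow observation above. It says
nothing about what happens with several producers.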
In practice, this means tuning the TX ring to the *minimum* size it can
be without starving (this is basically what BQL does for Ethernet), and
keeping packets queued in the qdisc layer instead, where it can be
managed...
I was going to add BQL (as $SUBJECT may have caused you to infer) but
trivially adding the netdev_sent_queue() in tun_net_xmit() and
netdev_completed_queue() for xdp vs. skb in tun_do_read() was tripping
the BUG in dql_completed().
Something like https://lists.openwall.net/netdev/2012/11/12/67 ?
Thanks
I just ripped that part out and focused on
the queue stop/start and haven't gone back to it yet.
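
To illustrate the kind of accounting mismatch that can trip that check,
here is a small userspace model. As far as I understand it,
dql_completed() requires that the bytes reported as completed never
exceed the bytes still outstanding. struct model_dql and the model_*()
helpers are hypothetical simplifications, and this is not a claim about
what actually went wrong in the tun experiment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the byte counters BQL keeps. */
struct model_dql {
	unsigned int num_queued;    /* total bytes reported as sent */
	unsigned int num_completed; /* total bytes reported as completed */
};

/* Role of netdev_sent_queue(): account bytes handed to the ring. */
static void model_sent(struct model_dql *d, unsigned int bytes)
{
	d->num_queued += bytes;
}

/*
 * Role of netdev_completed_queue(). Returns false in the situation
 * where the kernel check would fire: more bytes completed than are
 * currently outstanding.
 */
static bool model_completed(struct model_dql *d, unsigned int bytes)
{
	assert(d->num_completed <= d->num_queued);

	if (bytes > d->num_queued - d->num_completed)
		return false;

	d->num_completed += bytes;
	return true;
}
```

For example, if a packet were accounted as 100 bytes on the xmit side but
as 114 bytes on the read side, model_completed() would return false for
it. Any path where the two sides measure length differently would need
the same byte count recorded at both ends.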