On 2020-12-11 21:16, Cong Wang wrote:
On Fri, Dec 11, 2020 at 7:26 AM Maxim Mikityanskiy <maxi...@mellanox.com> wrote:

HTB doesn't scale well because of contention on a single lock, and it
also consumes CPU. This patch adds support for offloading HTB to
hardware that supports hierarchical rate limiting.

This solution addresses two main problems of scaling HTB:

1. Contention by flow classification. Currently the filters are attached
to the HTB instance as follows:

I do not think this is the reason; tcf_classify() has been called under RCU only on the ingress side for a rather long time. What contention are you talking about here?

When one attaches filters to HTB itself, tcf_classify is called from htb_classify, which is called from htb_enqueue, which in turn runs with the root spinlock of the qdisc held.


     # tc filter add dev eth0 parent 1:0 protocol ip flower dst_port 80
     classid 1:10

It's possible to move classification to clsact egress hook, which is
thread-safe and lock-free:

     # tc filter add dev eth0 egress protocol ip flower dst_port 80
     action skbedit priority 1:10

This way classification still happens in software, but the lock
contention is eliminated, and it happens before selecting the TX queue,
allowing the driver to translate the class to the corresponding hardware
queue.

Sure, you can use clsact with HTB, or any combination you like, but you can't assume your HTB only works with clsact, can you?

The goal is to eliminate the root lock from the datapath, and the traditional filters attached to HTB itself are handled under that lock. I believe it's a reasonable limitation: the offloaded mode is a new, opt-in mode of operation, and it may have additional hardware-imposed limitations anyway.



Note that this is already compatible with non-offloaded HTB and doesn't require changes to either the kernel or iproute2.

2. Contention by handling packets. HTB is not multi-queue; it attaches to a whole net device, and handling of all packets takes the same lock.
When HTB is offloaded, its algorithm is done in hardware. HTB registers
itself as a multi-queue qdisc, similarly to mq: HTB is attached to the
netdev, and each queue has its own qdisc. The control flow is still done
by HTB: it calls the driver via ndo_setup_tc to replicate the hierarchy
of classes in the NIC. Leaf classes are represented by hardware queues.
The data path works as follows: a packet is classified by clsact, the
driver selects a hardware queue according to its class, and the packet
is enqueued into this queue's qdisc.
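
To make this concrete, a minimal setup for the offloaded mode could look roughly like this ("offload" is the flag name used in this series and may still change; the device name and rates are just placeholders):

     # tc qdisc replace dev eth0 root handle 1: htb offload   # "offload" as proposed in this series
     # tc class add dev eth0 parent 1: classid 1:1 htb rate 10gbit ceil 10gbit
     # tc class add dev eth0 parent 1:1 classid 1:10 htb rate 1gbit ceil 10gbit
     # tc qdisc add dev eth0 clsact
     # tc filter add dev eth0 egress protocol ip flower ip_proto tcp dst_port 80 action skbedit priority 1:10

The driver learns about the class hierarchy via ndo_setup_tc, and on transmit it maps skb->priority (set by skbedit) to the hardware queue backing class 1:10, so no HTB code runs in the software fast path.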

I did _not_ read your code; from what you describe here, it sounds like you just want a per-queue rate limit instead of a global one. So why bother with HTB, whose goal is a global rate limit?

I would disagree. HTB's goal is hierarchical rate limits with borrowing. Sure, it can be used just to set a global limit, but its main purpose is creating a hierarchy of classes.

And yes, we are talking about the whole netdevice here, not about per-queue limits (we already support per-queue rate limits by means of tx_maxrate, so we wouldn't need any new code for that). The tree of classes is global for the whole netdevice, with hierarchy and borrowing supported. The additional send queues can be considered hardware objects that represent offloaded leaf traffic classes (and this can be extended to multiple queues per class).

So, we are really offloading HTB functionality here, not just using the HTB interface for something else (something simpler). I hope this makes the intent clearer.
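
For illustration, extending the example above with sibling leaves uses the same class commands as for software HTB, only under the offloaded root; each leaf is guaranteed its rate and borrows unused bandwidth from parent 1:1 up to its ceil (the rates are placeholders):

     # tc class add dev eth0 parent 1:1 classid 1:11 htb rate 4gbit ceil 10gbit
     # tc class add dev eth0 parent 1:1 classid 1:12 htb rate 5gbit ceil 10gbit

That borrowing logic across the whole tree is what the hardware is expected to implement when the hierarchy is offloaded.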

And doesn't TBF already work with mq? I mean you can attach it as
a leaf to each mq so that the tree lock will not be shared either, but you'd
lose the benefits of a global rate limit too.

Yes, I'd lose not only the global rate limit, but also multi-level hierarchical limits, which are all provided by this HTB offload - that's why TBF is not really a replacement for this feature.
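
For comparison, the mq + TBF arrangement mentioned above looks roughly like this: one independent token bucket per TX queue, with no shared budget and no parent to borrow from (device name and numbers are placeholders):

     # tc qdisc add dev eth0 root handle 1: mq
     # tc qdisc add dev eth0 parent 1:1 handle 10: tbf rate 1gbit burst 128k latency 1ms
     # tc qdisc add dev eth0 parent 1:2 handle 20: tbf rate 1gbit burst 128k latency 1ms
     # ... and so on, one tbf per TX queue

Providing that shared, hierarchical budget in hardware is exactly what this HTB offload is meant to add.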

EDT does basically the same,
but it never claims to completely replace HTB. ;)

Thanks.

