On Fri, Jun 26, 2020 at 3:46 AM Maxim Mikityanskiy <maxi...@mellanox.com> wrote:
>
> HTB doesn't scale well because of contention on a single lock, and it
> also consumes CPU. Mellanox hardware supports hierarchical rate limiting
> that can be leveraged by offloading the functionality of HTB.

True, essentially because it has to enforce a global rate limit with
link sharing.

There is a proposal to add a new lockless shaping qdisc, which
you can find on the netdev list.

>
> Our solution addresses two problems of HTB:
>
> 1. Contention by flow classification. Currently the filters are attached
> to the HTB instance as follows:
>
>     # tc filter add dev eth0 parent 1:0 protocol ip flower dst_port 80 \
>         classid 1:10
>
> It's possible to move classification to clsact egress hook, which is
> thread-safe and lock-free:
>
>     # tc filter add dev eth0 egress protocol ip flower dst_port 80 \
>         action skbedit priority 1:10
>
> This way classification still happens in software, but the lock
> contention is eliminated, and it happens before selecting the TX queue,
> allowing the driver to translate the class to the corresponding hardware
> queue.
>
> Note that this is already compatible with non-offloaded HTB and doesn't
> require changes to the kernel nor iproute2.
>
> 2. Contention by handling packets. HTB is not multi-queue, it attaches
> to a whole net device, and handling of all packets takes the same lock.
> Our solution offloads the logic of HTB to the hardware and registers HTB
> as a multi-queue qdisc, similarly to how mq qdisc does, i.e. HTB is
> attached to the netdev, and each queue has its own qdisc. The control
> flow is performed by HTB, it replicates the hierarchy of classes in
> hardware by calling callbacks of the driver. Leaf classes are represented
> by hardware queues. The data path works as follows: a packet is
> classified by clsact, the driver selects the hardware queue according
> to its class, and the packet is enqueued into this queue's qdisc.
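
To make sure I read the data path above correctly, here is a rough
sketch in Python (not driver code). It assumes the skbedit action stores
the classid in the packet priority as (major << 16) | minor, with both
parts parsed as hex the way tc does. The class_to_txq table, the second
class 1:20 and the select_queue() helper are hypothetical names for
illustration only; the actual mapping would be whatever the driver sets
up when HTB creates the leaf classes.

```python
def classid_to_priority(classid: str) -> int:
    """Convert a tc classid such as "1:10" into a 32-bit priority value.

    tc parses both halves as hexadecimal; the major number goes into the
    upper 16 bits and the minor number into the lower 16 bits.
    """
    major, minor = classid.split(":")
    return (int(major, 16) << 16) | int(minor or "0", 16)


# Hypothetical table: leaf class -> hardware TX queue index.
class_to_txq = {
    classid_to_priority("1:10"): 0,
    classid_to_priority("1:20"): 1,
}


def select_queue(skb_priority: int, default_txq: int = 2) -> int:
    """Pick the TX queue for a packet already classified by clsact.

    Packets whose priority does not name a known leaf class fall back to
    default_txq (an assumption of this sketch).
    """
    return class_to_txq.get(skb_priority, default_txq)
```

Note that nothing here takes a shared lock: the lookup only reads a
table that changes when the class hierarchy changes.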

Are you sure the HTB algorithm could still work even after you
kinda make each HTB class separated? I think they must still share
something when they borrow bandwidth from each other. This is why I
doubt you can simply add a ->attach() without touching the core
algorithm.
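
To illustrate what I mean by sharing, here is a toy model in Python.
It is not the HTB algorithm: it assumes a single accounting interval
with byte budgets instead of token buckets, and ToyClass, try_send()
and the numbers are made up for the example. The only point is that
once a class goes above its own rate, the decision depends on a counter
in the parent that every sibling also updates.

```python
class ToyClass:
    """One node of a toy class hierarchy with a per-interval byte budget."""

    def __init__(self, name, rate, ceil, parent=None):
        self.name = name
        self.rate = rate      # assured bytes per interval
        self.ceil = ceil      # maximum bytes per interval, including borrowed
        self.parent = parent
        self.used = 0         # bytes charged to this node in this interval

    def _has_room(self, size):
        # Within the assured rate: no need to look at the parent.
        if self.used + size <= self.rate:
            return True
        # Above the rate: allowed only up to ceil, and only if an
        # ancestor can lend. This reads state shared with siblings.
        if self.used + size > self.ceil or self.parent is None:
            return False
        return self.parent._has_room(size)

    def try_send(self, size):
        """Charge size bytes to this class and all ancestors if allowed."""
        if not self._has_room(size):
            return False
        node = self
        while node is not None:
            node.used += size
            node = node.parent
        return True


root = ToyClass("1:1", rate=100, ceil=100)
a = ToyClass("1:10", rate=40, ceil=100, parent=root)
b = ToyClass("1:20", rate=60, ceil=100, parent=root)

b.try_send(50)           # b stays within its own rate
a.try_send(40)           # a uses up its own rate
print(a.try_send(30))    # a must borrow; the result depends on what b sent
```

If each leaf lived in a fully independent per-queue qdisc, root.used
would have to be kept somewhere else, in this case presumably in the
hardware.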

Thanks.
