On 2020-12-15 18:37, Jamal Hadi Salim wrote:
On 2020-12-14 3:30 p.m., Maxim Mikityanskiy wrote:
On 2020-12-14 21:35, Cong Wang wrote:
On Mon, Dec 14, 2020 at 7:13 AM Maxim Mikityanskiy <maxi...@nvidia.com> wrote:

On 2020-12-11 21:16, Cong Wang wrote:
On Fri, Dec 11, 2020 at 7:26 AM Maxim Mikityanskiy <maxi...@mellanox.com> wrote:




Interesting. Please explain how your HTB offload still has a global rate
limit and borrowing across queues.

Sure, I will explain that.

I simply can't see it; all I can see
is that you offload HTB into each queue in ->attach(),

In the non-offload mode, the same HTB instance would be attached to all queues. In the offload mode, HTB behaves like MQ: there is a root instance of HTB, but each queue gets a separate simple qdisc (pfifo). Only the root qdisc (HTB) gets offloaded, and when that happens, the NIC creates an object for the QoS root.
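
To make the MQ-like shape concrete, here is a minimal sketch (not the exact sch_htb.c code; everything beyond the standard qdisc helpers is illustrative) of what the offload-mode ->attach() boils down to: every TX queue gets its own default pfifo, while the root HTB only keeps the class hierarchy that is mirrored to the NIC.

#include <net/sch_generic.h>
#include <net/pkt_sched.h>

/* Sketch only: MQ-like attach of an offloaded root HTB. */
static void htb_offload_attach_sketch(struct Qdisc *sch)
{
	struct net_device *dev = qdisc_dev(sch);
	unsigned int ntx;

	for (ntx = 0; ntx < dev->real_num_tx_queues; ntx++) {
		struct netdev_queue *dev_queue = netdev_get_tx_queue(dev, ntx);
		struct Qdisc *qdisc, *old;

		/* Each TX queue gets a plain pfifo; the NIC, not a
		 * software qdisc, enforces the HTB rates and borrowing.
		 */
		qdisc = qdisc_create_dflt(dev_queue, &pfifo_qdisc_ops,
					  TC_H_MAKE(TC_H_MAJ(sch->handle),
						    TC_H_MIN(ntx + 1)),
					  NULL);
		if (!qdisc)
			continue;

		old = dev_graft_qdisc(dev_queue, qdisc);
		qdisc_put(old);
	}
}

The important point is that no software HTB instance sits on the data path: the per-queue qdiscs just buffer packets, and the shaping happens in the NIC.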

Then all configuration changes are sent to the driver, and it issues the corresponding firmware commands to replicate the whole hierarchy in the NIC. Leaf classes correspond to queue groups (in this implementation, a queue group contains only one queue, but that can be extended),


FWIW, it is very valuable to be able to abstract HTB if the hardware
can emulate it (users don't have to learn new abstractions).

Yes, that's the reason for using an existing interface (HTB) to configure the feature.

Since you are expressing a limitation above:
How does the user discover if they have over-provisioned, i.e. the
single-queue example above?

It comes down to CPU usage. If the core that serves the queue is busy sending packets 100% of the time, you need more queues. Also, if the user runs more than one application belonging to the same class and pins them to different cores, it makes sense to create more than one queue.

I'd like to emphasize that this is not a hard limitation. Our hardware and firmware support multiple queues per class. What's needed is support on the driver side and probably an additional parameter to tc class add that specifies the number of queues to reserve.

If there are too many corner cases, it may
make sense to just create a new qdisc.

and inner classes correspond to entities called TSARs.

The information about rate limits is stored inside TSARs and queue groups. Queues know what groups they belong to, and groups and TSARs know which TSAR is their parent. A queue is picked in ndo_select_queue by looking at the classification result of clsact. So, when a packet is put onto a queue, the NIC can traverse the whole hierarchy and apply the HTB algorithm.
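
A minimal sketch of that last step, assuming the clsact filter leaves the HTB leaf classid in skb->priority and the driver keeps a small classid-to-queue table (both the table and the function names are illustrative, not the actual mlx5 code):

#include <linux/netdevice.h>
#include <linux/skbuff.h>

/* Hypothetical per-device table: HTB leaf classid -> reserved TX queue. */
struct htb_qos_map_sketch {
	u16 nclasses;
	struct {
		u32 classid;
		u16 txq;
	} entries[256];			/* e.g. up to 256 leaf classes */
};

/* Sketch of an ndo_select_queue implementation for the offload mode. */
static u16 htb_select_queue_sketch(struct net_device *dev,
				   struct sk_buff *skb,
				   struct net_device *sb_dev)
{
	const struct htb_qos_map_sketch *map = netdev_priv(dev); /* illustrative */
	u32 classid = skb->priority;	/* classification result of clsact */
	int i;

	for (i = 0; i < map->nclasses; i++)
		if (map->entries[i].classid == classid)
			return map->entries[i].txq;

	/* Not an HTB-classified packet: use the regular queue selection. */
	return netdev_pick_tx(dev, skb, sb_dev);
}

Once the packet lands on the queue reserved for its leaf class, the NIC knows which queue group and TSARs it belongs to and shapes it accordingly.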


Same question above:
Is there a limit to the number of classes that can be created?

Yes, the commit message of the mlx5 patch lists the limitations of our NICs. Basically, it's 256 leaf classes and 3 levels of hierarchy.

IOW, if someone just creates an arbitrary number of queues, do they
get an error if it doesn't make sense for the hardware?

The current implementation fails gracefully if the limits are exceeded: the tc command won't succeed, and everything rolls back to the stable state that existed just before that tc command.
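
For illustration, a limit check on the driver side could look roughly like this (the request structure and constant names are made up for the sketch; the point is that the error is reported through extack, so tc prints it and nothing is left half-configured):

#include <linux/errno.h>
#include <linux/netlink.h>

#define EXAMPLE_MAX_LEAF_CLASSES	256	/* per the commit message */
#define EXAMPLE_MAX_HIERARCHY_DEPTH	3

/* Illustrative offload request; not the actual offload structure. */
struct example_htb_offload_req {
	u16 num_leaf_classes;
	u16 depth;
	struct netlink_ext_ack *extack;
};

/* Sketch: validate the requested hierarchy against the NIC limits. */
static int example_htb_validate(const struct example_htb_offload_req *req)
{
	if (req->num_leaf_classes > EXAMPLE_MAX_LEAF_CLASSES) {
		NL_SET_ERR_MSG(req->extack,
			       "Too many HTB leaf classes for this NIC");
		return -EOPNOTSUPP;
	}
	if (req->depth > EXAMPLE_MAX_HIERARCHY_DEPTH) {
		NL_SET_ERR_MSG(req->extack,
			       "HTB hierarchy is too deep for this NIC");
		return -EOPNOTSUPP;
	}
	return 0;	/* OK to issue the firmware commands */
}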

If such limits exist, it may make sense to provide a knob to query them
(maybe via ethtool)

Sounds legit, but I'm not sure what the best interface for that would be. Ethtool is not involved at all in this implementation, and AFAIK it doesn't have any existing command for similar stuff. We could hook into set-channels and add a new type of channel for HTB, but the semantics aren't very clear, because HTB queues != HTB leaf classes, and I don't know whether it's acceptable to extend this interface (if so, I have more ideas for extending it for other purposes).

and if such limits can be adjusted, it may be worth
looking at providing interfaces via devlink.

Not really. At the moment, there isn't a good reason to decrease the maximum limits. It would make sense if doing so could free up resources for something else, but AFAIK that's not the case now.

Thanks,
Max

cheers,
jamal


cheers,
jamal
