On 2020-12-15 18:37, Jamal Hadi Salim wrote:
On 2020-12-14 3:30 p.m., Maxim Mikityanskiy wrote:
On 2020-12-14 21:35, Cong Wang wrote:
On Mon, Dec 14, 2020 at 7:13 AM Maxim Mikityanskiy
<maxi...@nvidia.com> wrote:
On 2020-12-11 21:16, Cong Wang wrote:
On Fri, Dec 11, 2020 at 7:26 AM Maxim Mikityanskiy
<maxi...@mellanox.com> wrote:
Interesting, please explain how your HTB offload still has a global rate
limit and borrowing across queues?
Sure, I will explain that.
I simply can't see it; all I can see
is that you offload HTB into each queue in ->attach(),
In the non-offload mode, the same HTB instance would be attached to
all queues. In the offload mode, HTB behaves like MQ: there is a root
instance of HTB, but each queue gets a separate simple qdisc (pfifo).
Only the root qdisc (HTB) gets offloaded, and when that happens, the
NIC creates an object for the QoS root.
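To illustrate the attach part, here is a rough sketch (not the actual
patch code; names and details are simplified, and it just follows the
same pattern as mq's ->attach()):

#include <linux/netdevice.h>
#include <linux/pkt_sched.h>
#include <net/sch_generic.h>

/* Sketch only: each TX queue gets a trivial pfifo, while the HTB
 * hierarchy itself is created in the NIC via ndo_setup_tc(), so no
 * HTB dequeue logic runs on the CPU in this mode.
 */
static void htb_offload_attach_sketch(struct Qdisc *sch)
{
	struct net_device *dev = qdisc_dev(sch);
	unsigned int ntx;

	for (ntx = 0; ntx < dev->num_tx_queues; ntx++) {
		struct netdev_queue *dev_queue = netdev_get_tx_queue(dev, ntx);
		struct Qdisc *qdisc, *old;

		/* One pfifo per TX queue, numbered under the HTB root handle. */
		qdisc = qdisc_create_dflt(dev_queue, &pfifo_qdisc_ops,
					  TC_H_MAKE(TC_H_MAJ(sch->handle),
						    TC_H_MIN(ntx + 1)),
					  NULL);
		if (!qdisc)
			continue;
		qdisc->flags |= TCQ_F_ONETXQUEUE;

		old = dev_graft_qdisc(dev_queue, qdisc);
		if (old)
			qdisc_put(old);
	}
}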
Then all configuration changes are sent to the driver, and it issues
the corresponding firmware commands to replicate the whole hierarchy
in the NIC. Leaf classes correspond to queue groups (in this
implementation, queue groups contain only one queue, but this can be
extended),
FWIW, it is very valuable to be able to abstract HTB if the hardware
can emulate it (users don't have to learn about new abstractions).
Yes, that's the reason for using an existing interface (HTB) to
configure the feature.
Since you are expressing a limitation above:
How does the user discover that they have over-provisioned, i.e., in
the single-queue example above?
It comes down to CPU usage. If the core that serves the queue is busy
sending packets 100% of the time, you need more queues. Also, if the
user runs more than one application belonging to the same class and
pins them to different cores, it makes sense to create more than one queue.
I'd like to emphasize that this is not a hard limitation. Our hardware
and firmware support multiple queues per class. What's needed is
support on the driver side and probably an additional parameter to tc
class add to specify the number of queues to reserve.
If there are too many corner cases, it may
make sense to just create a new qdisc.
and inner classes correspond to entities called TSARs.
The information about rate limits is stored inside TSARs and queue
groups. Queues know which groups they belong to, and groups and TSARs
know which TSAR is their parent. A queue is picked in ndo_select_queue
by looking at the classification result of clsact. So, when a packet
is put onto a queue, the NIC can track the whole hierarchy and run the
HTB algorithm.
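For illustration, the queue selection could look roughly like this
(hypothetical sketch, not the driver code; the clsact filter is assumed
to put the HTB classid into skb->priority, and get_htb_offload_map()
and struct htb_offload_map are made up for the example):

#include <linux/netdevice.h>
#include <linux/pkt_sched.h>
#include <linux/skbuff.h>

/* Made-up mapping state: which major handle belongs to the offloaded
 * HTB root and which TX queue is reserved for each leaf classid.
 */
struct htb_offload_map {
	u32 htb_maj_id;			/* major handle of the offloaded HTB root */
	u16 classid_to_txq[256];	/* leaf classid (minor) -> reserved TX queue */
};

/* Hypothetical helper; a real driver would keep this in its private data. */
static struct htb_offload_map *get_htb_offload_map(struct net_device *dev);

static u16 htb_offload_select_queue(struct net_device *dev,
				    struct sk_buff *skb,
				    struct net_device *sb_dev)
{
	struct htb_offload_map *map = get_htb_offload_map(dev);
	u16 minor = TC_H_MIN(skb->priority);

	/* Packet classified into an offloaded HTB leaf: send it to the
	 * queue reserved for that class, so the NIC can apply the
	 * per-class rate limits. */
	if (map && TC_H_MAJ(skb->priority) == map->htb_maj_id &&
	    minor < ARRAY_SIZE(map->classid_to_txq))
		return map->classid_to_txq[minor];

	/* Otherwise fall back to the regular queue selection. */
	return netdev_pick_tx(dev, skb, sb_dev);
}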
Same question as above:
Is there a limit to the number of classes that can be created?
Yes, the commit message of the mlx5 patch lists the limitations of our
NICs. Basically, it's 256 leaf classes and 3 levels of hierarchy.
IOW, if someone just creates an arbitrary number of queues, do they
get errored out if it doesn't make sense for the hardware?
The current implementation fails gracefully if the limits are
exceeded: the tc command won't succeed, and everything rolls back to
the stable state that existed just before the tc command.
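Just to illustrate the failure mode (made-up names, not the mlx5 code):
the driver-side check can simply refuse the request, the tc command
then returns the error, and the qdisc keeps its previous state:

#include <linux/errno.h>
#include <linux/netlink.h>

/* Hypothetical per-device QoS state; the limits below are the ones
 * quoted above for our NICs. */
struct qos_sketch {
	unsigned int num_leaves;
};

#define QOS_SKETCH_MAX_LEAVES	256
#define QOS_SKETCH_MAX_DEPTH	3

static int qos_sketch_alloc_leaf(struct qos_sketch *qos, unsigned int depth,
				 struct netlink_ext_ack *extack)
{
	if (qos->num_leaves >= QOS_SKETCH_MAX_LEAVES) {
		NL_SET_ERR_MSG(extack, "Maximum number of leaf classes reached");
		return -ENOSPC;		/* surfaces as a failed tc command */
	}
	if (depth > QOS_SKETCH_MAX_DEPTH) {
		NL_SET_ERR_MSG(extack, "Hierarchy is too deep for the NIC");
		return -EOPNOTSUPP;
	}
	qos->num_leaves++;
	return 0;
}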
If such limits exist, it may make sense to provide a knob to query
them (maybe ethtool)
Sounds legit, but I'm not sure what the best interface for that would
be. Ethtool is not involved at all in this implementation, and AFAIK
it doesn't have an existing command for anything similar. We could hook
into set-channels and add a new type of channel for HTB, but the
semantics aren't very clear, because HTB queues != HTB leaf classes, and
I don't know if it's allowed to extend this interface (if so, I have
more thoughts on extending it for other purposes).
and if such limits can be adjusted, it may be worth
looking at providing interfaces via devlink.
Not really. At the moment, there isn't a good reason to decrease the
maximum limits. It would make sense if it could free up some resources
for something else, but AFAIK it's not the case now.
Thanks,
Max
cheers,
jamal