On Thu, 2023-12-14 at 17:46 -0800, Jakub Kicinski wrote:
> On Thu, 14 Dec 2023 21:29:51 +0100 Paolo Abeni wrote:
> > Together with Simon, I spent some time on the above. We think the
> > ndo_setup_tc(TC_SETUP_QDISC_TBF) hook could be used as a common basis
> > for these offloads, with some small extensions (adding a 'max_rate'
> > param, too).
>
> uAPI aside, why would we use ndo_setup_tc(TC_SETUP_QDISC_TBF)
> to implement the common basis?
>
> Is it not cleaner to have a separate driver API, with its ops
> and capabilities?
We understand that one of the end goals is consolidating the existing
rate-related in-kernel interfaces. Adding a new one does not feel like a
good starting point to reach that goal, see [1] & [2] ;). ndo_setup_tc()
feels like the natural choice for H/W offload, and TBF is IMHO the
existing interface nearest to the requirements here. The devlink rate
API could be a possible alternative...

> > The idea would be:
> > - 'fixing' sch_tbf so that the s/w path becomes a no-op when h/w
> >   offload is enabled
> > - extend sch_tbf to support max rate
> > - do the relevant ice implementation
> > - ndo_set_tx_maxrate could be replaced with the mentioned ndo call
> >   (the latter interface is a strict super-set of the former)
> > - ndo_set_vf_rate could also be replaced with the mentioned ndo call
> >   (with another small extension to the offload data)
> >
> > I think mqprio deserves its own separate offload interface, as it
> > covers multiple tasks other than shaping (grouping queues and
> > mapping priority to classes)
> >
> > In the long run we could have a generic implementation of
> > ndo_setup_tc(TC_SETUP_QDISC_TBF) in terms of devlink rate, adding a
> > generic way to fetch the devlink_port instance corresponding to the
> > given netdev and mapping the TBF features to the devlink_rate API.
> >
> > Not starting this due to what Jiri mentioned [1].
>
> Jiri, AFAIU, is against using devlink rate *uAPI* to configure network
> rate limiting. That's separate from the internal representation.

... with a couple of caveats:

1) AFAICS devlink (and/or devlink_port) does not have a fine-grained,
per-queue representation, and Intel wants to be able to configure
shaping on a per-queue basis. I think/hope we don't want to drag the
discussion into extending the devlink interface with queue support; I
fear that would block us for a long time. Perhaps I'm missing or
misunderstanding something here; otherwise, in retrospect, this looks
like a good reason to avoid devlink here entirely.

2) My understanding of Jiri's statement was more restrictive. @Jiri, it
would be great if you could share your own interpretation: are you ok
with using the devlink_port rate API as a basis to replace
ndo_set_tx_maxrate() (via dev->devlink_port->devlink->) and possibly
ndo_set_vf_rate()? Note that, given the previous point, this option
would still feel problematic.

Cheers,

Paolo

[1] https://xkcd.com/927/
[2] https://www.youtube.com/watch?v=f8kO_L-pDwo
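
P.S.: purely to make the 'small extensions' mentioned above a bit more
concrete, here is a very rough, untested sketch of what the extended TBF
offload data could look like. The max_rate field and the comments are
the hypothetical new bits; the rest just mirrors the current structs in
include/net/pkt_cls.h:

/* Rough sketch only, not a real patch. Everything except 'max_rate'
 * mirrors the existing definitions in include/net/pkt_cls.h.
 */
struct tc_tbf_qopt_offload_replace_params {
	struct psched_ratecfg rate;	/* existing committed rate */
	struct psched_ratecfg max_rate;	/* hypothetical peak/max rate,
					 * 0 == uncapped
					 */
	u32 max_size;
	struct gnet_stats_queue *qstats;
};

struct tc_tbf_qopt_offload {
	enum tc_tbf_command command;	/* TC_TBF_REPLACE, TC_TBF_DESTROY, ... */
	u32 handle;
	u32 parent;			/* for per-queue shaping this would
					 * be the per-queue class (e.g. under
					 * mq)
					 */
	union {
		struct tc_tbf_qopt_offload_replace_params replace_params;
		struct tc_qopt_offload_stats stats;
		u32 child_handle;
	};
};

With something along those lines, the existing
ndo_set_tx_maxrate(dev, queue_index, maxrate) could in principle be
expressed as a TC_TBF_REPLACE on the relevant per-queue parent with only
max_rate set, and ndo_set_vf_rate() would just need one more field in
the offload data to identify the target VF, as noted above.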