Hi Cristian,

The way QoS works just now should be feasible for dynamic targets. That is, functions similar to rte_sched_port_enqueue() and rte_sched_port_dequeue() would be called: the first to enqueue the mbufs onto the queues, the second to dequeue them. The QoS structures and scheduler don't need to be as functionally rich, though. I would have thought a simple pipe with child nodes should suffice for most cases. That would allow each tunnel/session to be shaped, with the queueing and drop logic inherited from what is there just now.
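To make the idea concrete, here is a minimal sketch in plain C of a scheduler structure that is owned by a tunnel rather than by an ethdev. It is not DPDK code: struct pkt stands in for an mbuf, and every name in it (struct tun_sched, struct tunnel, tun_sched_init(), tun_sched_enqueue(), tun_sched_dequeue()) is hypothetical. It assumes one pipe with a byte credit and strict priority across its child queues, with tail drop on enqueue; token replenishment is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TS_NB_QUEUES 4   /* child nodes under the pipe (hypothetical value) */
#define TS_QUEUE_LEN 64  /* per-queue depth; must be a power of two */

/* Stand-in for struct rte_mbuf: only what the sketch needs. */
struct pkt {
    uint32_t len; /* packet length in bytes */
    uint8_t tc;   /* selects the child queue */
};

struct ts_queue {
    struct pkt *ring[TS_QUEUE_LEN];
    uint32_t head, tail; /* free-running indices */
};

/* One pipe with a byte credit and a few child queues. */
struct tun_sched {
    uint64_t tokens; /* bytes the pipe may still send */
    uint64_t burst;  /* bucket depth in bytes */
    struct ts_queue q[TS_NB_QUEUES];
};

/* The calling code hangs the scheduler off whatever it wants to shape. */
struct tunnel {
    uint32_t id;
    struct tun_sched sched;
};

static void tun_sched_init(struct tun_sched *s, uint64_t burst)
{
    memset(s, 0, sizeof(*s));
    s->burst = burst;
    s->tokens = burst;
}

/* Enqueue with tail drop. Returns the number of packets accepted. */
static uint32_t tun_sched_enqueue(struct tun_sched *s, struct pkt **pkts,
                                  uint32_t n)
{
    uint32_t i, ok = 0;

    for (i = 0; i < n; i++) {
        struct ts_queue *q = &s->q[pkts[i]->tc % TS_NB_QUEUES];

        if (q->tail - q->head == TS_QUEUE_LEN)
            continue; /* queue full: drop (a real caller would free it) */
        q->ring[q->tail++ % TS_QUEUE_LEN] = pkts[i];
        ok++;
    }
    return ok;
}

/* Dequeue in strict priority order (queue 0 first) while credit lasts. */
static uint32_t tun_sched_dequeue(struct tun_sched *s, struct pkt **pkts,
                                  uint32_t n)
{
    uint32_t qi, out = 0;

    for (qi = 0; qi < TS_NB_QUEUES && out < n; qi++) {
        struct ts_queue *q = &s->q[qi];

        while (out < n && q->head != q->tail) {
            struct pkt *p = q->ring[q->head % TS_QUEUE_LEN];

            if (p->len > s->tokens)
                return out; /* pipe is out of credit: stop here */
            s->tokens -= p->len;
            q->head++;
            pkts[out++] = p;
        }
    }
    return out;
}
```

With something like this, the tunnel code could call tun_sched_dequeue() on its own scheduler and only then encrypt, so any reordering happens before sequence numbers are assigned.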
Thanks,
Alan.

-----Original Message-----
From: Dumitrescu, Cristian [mailto:cristian.dumitre...@intel.com]
Sent: Wednesday, December 07, 2016 7:52 PM
To: Alan Robertson
Cc: dev@dpdk.org; Thomas Monjalon
Subject: RE: [dpdk-dev] [RFC] ethdev: abstraction layer for QoS hierarchical scheduler

Hi Alan,

Thanks for your comments!

> Hi Cristian,
> Looking at points 10 and 11 it's good to hear nodes can be dynamically added.

Yes, many implementations allow on-the-fly remapping of a node from one parent to another, or simply adding more nodes post-initialization, so it is natural for the API to provide this.

> We've been trying to decide the best way to do this for support of QoS
> on tunnels for some time now, and the existing implementation doesn't
> allow this, which effectively ruled out hierarchical queueing for tunnel
> targets on the output interface.
> Having said that, has thought been given to separating the queueing from
> being so closely tied to the Ethernet transmit process? When queueing on
> a tunnel, for example, we may be working with encryption. When running
> with an anti-replay window it is really much better to do the QoS
> (packet reordering) before the encryption. To support this, would it be
> possible to have a separate scheduler structure which can be passed into
> the scheduling API? This means the calling code can hang the structure
> off whatever entity it wishes to perform QoS on, and we get dynamic
> target support (sessions/tunnels etc.).

Yes, this is one point where we need to look for a better solution. The current proposal attaches the hierarchical scheduler function to an ethdev, so scheduling traffic for tunnels that have a pre-defined bandwidth is not supported nicely. This question was also raised in VPP, but there tunnels are supported as a type of output interface, so attaching scheduling to an output interface also covers the tunnels case. It looks to me that nice tunnel abstractions are a gap in DPDK as well.
Any thoughts about how tunnels should be supported in DPDK? What do other people think about this?

> Regarding the structure allocation, would it be possible to make the
> number of queues associated with a TC a compile-time option which the
> scheduler would accommodate?
> We frequently only use one queue per TC, which means 75% of the space
> allocated at the queueing layer for that TC is never used. This may
> be specific to our implementation, but if other implementations do the
> same and folks could say so, we may get a better idea of whether this
> is a common case.
> Whilst touching on the scheduler, the token replenishment works using
> a division and multiplication, obviously to cater for the fact that it
> may be run after several tc windows have passed. The most commonly
> used industrial scheduler simply checks that the tc has elapsed and
> then adds the bc. This relies on the scheduler being called within the
> tc window, though. It would be nice to have this as a configurable
> option since it's much more efficient, assuming the infra code from
> which it's called can guarantee the calling frequency.

This is probably feedback for librte_sched as opposed to the current API proposal, as the latter is intended to be generic/implementation-agnostic and therefore its scope far exceeds the existing set of librte_sched features. Btw, we do plan on using the librte_sched feature as the default fall-back when the HW ethdev is not scheduler-enabled, as well as the implementation of choice for a lot of use-cases where it fits really well, so we do have to continue to evolve and improve librte_sched feature-wise and performance-wise.

> I hope you'll consider these points for inclusion into a future road
> map. Hopefully in the future my employer will increase the priority
> of some of the tasks and a PR may appear on the mailing list.
> Thanks,
> Alan.
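
For reference, the two token replenishment styles discussed in the quoted text above can be sketched roughly as follows. This is an illustration only, not the librte_sched code: struct tb and both function names are hypothetical, time is in arbitrary units, and overflow of n * bc is ignored. The first variant uses a division and a multiplication so it can credit several elapsed tc windows in one call; the second adds a single bc once a tc has elapsed and therefore assumes it is called at least once per tc window.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical token bucket state. */
struct tb {
    uint64_t tc;     /* replenishment interval, in time units */
    uint64_t bc;     /* tokens credited per interval */
    uint64_t size;   /* bucket depth; tokens never exceed this */
    uint64_t tokens; /* current credit */
    uint64_t last;   /* start of the current interval */
};

/* Add 'add' tokens, saturating at the bucket depth. */
static void tb_credit(struct tb *b, uint64_t add)
{
    b->tokens = (b->size - b->tokens < add) ? b->size : b->tokens + add;
}

/* General variant: credits every whole tc window that has passed,
 * at the cost of one division and one multiplication per call. */
static void tb_refill_general(struct tb *b, uint64_t now)
{
    uint64_t n = (now - b->last) / b->tc;

    tb_credit(b, n * b->bc);
    b->last += n * b->tc;
}

/* Simple variant: one compare and one add. If several windows were
 * missed, only one bc is credited per call, so the caller must
 * guarantee it runs at least once per tc window. */
static void tb_refill_simple(struct tb *b, uint64_t now)
{
    if (now - b->last < b->tc)
        return;
    tb_credit(b, b->bc);
    b->last += b->tc;
}
```

When both are called on time the results are identical; they only diverge when a call arrives more than one tc late, which is exactly the case the caller would have to rule out before selecting the cheaper option.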