On 5/8/2018 10:19 AM, Alexander Duyck wrote:
> On Tue, May 8, 2018 at 10:07 AM, Eric Dumazet <eric.duma...@gmail.com> wrote:
>>
>> On 05/08/2018 09:02 AM, Alexander Duyck wrote:
>>> On Tue, May 8, 2018 at 8:15 AM, Tom Herbert <t...@herbertland.com> wrote:
>>>> On Thu, Apr 19, 2018 at 7:41 PM, Eric Dumazet <eduma...@google.com> wrote:
>>>>> On Thu, Apr 19, 2018 at 6:07 PM Amritha Nambiar
>>>>> <amritha.namb...@intel.com> wrote:
>>>>>
>>>>>> This patch series implements support for Tx queue selection based on
>>>>>> the Rx queue map. This is done by configuring the Rx queue map per
>>>>>> Tx queue using a sysfs attribute. If the user configuration for Rx
>>>>>> queues does not apply, then the Tx queue selection falls back to XPS
>>>>>> using CPUs and finally to hashing.
>>>>>
>>>>>> XPS is refactored to support Tx queue selection based on either the
>>>>>> CPU map or the Rx-queue map. The config option CONFIG_XPS needs to be
>>>>>> enabled. By default no receive queues are configured for a Tx queue.
>>>>>
>>>>>> - /sys/class/net/eth0/queues/tx-*/xps_rxqs
>>>>>
>>>>>> This enables sending packets on the same Tx-Rx queue pair, which is
>>>>>> useful for busy-polling multi-threaded workloads where it is not
>>>>>> possible to pin the threads to a CPU. This is a rework of Sridhar's
>>>>>> patch for symmetric queueing via a socket option:
>>>>>> https://www.spinics.net/lists/netdev/msg453106.html
>>>>>
>>>> I suspect this is an artifact of flow director, which I believe
>>>> required queue pairs to be able to work (i.e. the receive queue chosen
>>>> by the hardware is determined by the send queue). But that was only
>>>> required because of the hardware design; I don't see the rationale for
>>>> introducing queue pairs in the software stack. There's no need to
>>>> correlate the transmit path with the receive path, no need to enforce
>>>> a 1:1 mapping between RX and TX queues, and the OOO mitigations should
>>>> be sufficient when the TX queue changes for a flow.
>>>>
>>>> Tom
>>>
>>> If I am not mistaken, I think there are benefits to doing this sort of
>>> thing with polling, as it keeps the Tx work locked to the same queue
>>> pair that a given application is polling on. As a result you can keep
>>> the interrupts contained to the queue pair that is being busy polled,
>>> and if the application cleans up the packets during the busy poll it
>>> ends up being a net savings in terms of both latency and power, since
>>> the Tx clean-up happens sooner, and it happens on the queue that is
>>> already busy polling instead of possibly triggering an interrupt on
>>> another CPU.
>>>
>>> So, for example, in the case of routing and bridging workloads we
>>> already had code that would take the Rx queue and associate it with a
>>> Tx queue. One of the ideas behind doing this is to keep the CPU
>>> overhead low by having a 1:1 mapping. In the case of this code we
>>> allow a little more flexibility, in that you could have many-to-many
>>> mappings, but the general idea and the common use case are the same:
>>> a 1:1 mapping.
>>
>> I thought we had everything in place to be able to have this already.
>>
>> Setting IRQ affinities and XPS is certainly something doable.
>>
>> This is why I wanted proper documentation of yet another way to reach
>> the same behavior.
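
Regarding the IRQ affinity + XPS setup mentioned above, a minimal sketch
of that existing approach is below: pin each queue's IRQ vector to one
CPU and point XPS at the same CPU, giving an effective CPU:queue pairing.
The interface name, IRQ numbers and queue count are made up for
illustration (a real tool would look the vectors up in /proc/interrupts);
only the procfs/sysfs paths themselves are the standard ones.

/*
 * Sketch: pair CPU q with queue q by pinning the queue's IRQ and
 * setting XPS.  eth0, the IRQ base and the queue count are
 * illustrative values, not taken from any real system.
 */
#include <stdio.h>

static void write_str(const char *path, const char *val)
{
	FILE *f = fopen(path, "w");

	if (!f) {
		perror(path);
		return;
	}
	fprintf(f, "%s\n", val);
	fclose(f);
}

int main(void)
{
	char path[256], mask[32];
	int q, irq_base = 30;	/* hypothetical first vector of eth0 */

	for (q = 0; q < 4; q++) {
		snprintf(mask, sizeof(mask), "%x", 1 << q);

		/* IRQ affinity: queue q's vector is serviced by CPU q */
		snprintf(path, sizeof(path),
			 "/proc/irq/%d/smp_affinity", irq_base + q);
		write_str(path, mask);

		/* XPS: packets sent from CPU q go out on tx-q */
		snprintf(path, sizeof(path),
			 "/sys/class/net/eth0/queues/tx-%d/xps_cpus", q);
		write_str(path, mask);
	}
	return 0;
}

The missing piece for busy-polling applications is that this only pairs
things up if the application threads are also pinned to those CPUs,
which is the point Alex makes below.
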
>
> IRQ affinities and XPS work for pure NAPI setups, but the problem is
> that you also have to handle application affinity in the case of busy
> polling, which can present some additional challenges, since then you
> have to add code in your application to associate a given queue/CPU
> with a given application thread. I believe this is a way of
> simplifying that.
>
> I agree on the documentation aspect. The usage of this should be well
> documented, as well as the why of using it.
>
> - Alex

I'll submit another version of the series with the documentation added.
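
In the meantime, here is a rough sketch of how the new attribute could be
exercised together with busy polling. It assumes xps_rxqs takes a hex
bitmap of receive queues in the same format as xps_cpus; the interface
name, queue count and poll interval are only illustrative, so treat this
as an untested example rather than the final documented interface.

/*
 * Sketch: map rx-q to tx-q 1:1 through the new xps_rxqs attribute and
 * enable busy polling on a socket, so a busy-polling thread cleans Tx
 * completions on the queue pair it is already spinning on.  Assumes
 * xps_rxqs takes a hex bitmap of Rx queues (same format as xps_cpus);
 * eth0, the queue count and the 50us interval are illustrative.
 */
#include <stdio.h>
#include <sys/socket.h>

#ifndef SO_BUSY_POLL
#define SO_BUSY_POLL 46		/* fallback for older userspace headers */
#endif

static void write_str(const char *path, const char *val)
{
	FILE *f = fopen(path, "w");

	if (!f) {
		perror(path);
		return;
	}
	fprintf(f, "%s\n", val);
	fclose(f);
}

int main(void)
{
	char path[256], mask[32];
	int q, fd, usecs = 50;

	/* tx-q transmits for flows that were received on rx-q */
	for (q = 0; q < 4; q++) {
		snprintf(mask, sizeof(mask), "%x", 1 << q);
		snprintf(path, sizeof(path),
			 "/sys/class/net/eth0/queues/tx-%d/xps_rxqs", q);
		write_str(path, mask);
	}

	/* application side: busy poll instead of waiting for interrupts */
	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0) {
		perror("socket");
		return 1;
	}
	if (setsockopt(fd, SOL_SOCKET, SO_BUSY_POLL,
		       &usecs, sizeof(usecs)) < 0)
		perror("SO_BUSY_POLL");
	/*
	 * ... connect/recv as usual; with the mapping above the Tx work
	 * for the flow stays on the same queue pair the polling thread
	 * is servicing, without pinning the thread to a CPU.
	 */
	return 0;
}
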
- Amritha