On 2/14/19 9:01 AM, David Miller wrote:
> From: Jose Abreu <jose.ab...@synopsys.com>
> Date: Wed, 13 Feb 2019 18:00:43 +0100
>
>> Commit 8fce33317023 introduced the concept of NAPI per-channel and
>> independent cleaning of TX path.
>>
>> This is currently breaking performance in some cases. The scenario
>> happens when all packets are being received in Queue 0 but the TX is
>> performed in Queue != 0.
>>
>> I didn't look very deep but it seems that NAPI for Queue 0 will clean
>> the RX path but as TX is in different NAPI, this last one is called at a
>> slower rate which kills performance in TX. I suspect this is due to TX
>> cleaning takes much longer than RX and because NAPI will get canceled
>> once we return with 0 budget consumed (e.g. when TX is still not done it
>> will return 0 budget).
>>
>> Fix this by looking at all TX channels in NAPI poll function.
>>
>> Signed-off-by: Jose Abreu <joab...@synopsys.com>
>> Fixes: 8fce33317023 ("net: stmmac: Rework coalesce timer and fix multi-queue
>> races")
>
> No this isn't right.
>
> The TX interrupt events for Queue != 0 should clean up the TX packets
> on those queues.
>
> Furthermore you are breaking the locality of the TX processing.
>
> I'm not applying this, sorry.

Agreed, why don't you create per-queue NAPI instances such that they are
all independent and can complete their TX completion/RX processing
entirely separately?
--
Florian