On Mon, 28 Sep 2020 19:43:36 +0200 Eric Dumazet wrote:
> Wei, this is a very nice work.
>
> Please re-send it without the RFC tag, so that we can hopefully merge
> it ASAP.
The problem is that for the application I'm testing with, this
implementation is significantly slower (in terms of RPS) than Felix's
code:

        |      |       L  A  T  E  N  C  Y        |  App   |     C P U     |
        |  RPS |   AVG  |  P50  |   P99  |  P999  | Overld |  busy |  PSI  |
 thread | 1.1% | -15.6% | -0.3% | -42.5% |  -8.1% | -83.4% | -2.3% | 60.6% |
 work q | 4.3% | -13.1% |  0.1% | -44.4% |  -1.1% |   2.3% | -1.2% | 90.1% |
 TAPI   | 4.4% | -17.1% | -1.4% | -43.8% | -11.0% | -60.2% | -2.3% | 46.7% |

"thread" is this code, "work q" is Felix's code, and TAPI is my hacks.
The numbers compare performance to normal NAPI.

In all cases (but not in the baseline) I configured timer-based polling
(defer_hard_irqs), with a timeout of around 100us. Without deferring
hard IRQs, threaded NAPI is actually slower for this app. Also, I'm not
modifying niceness; doing that again causes an application performance
regression here.

1 NUMA node. 18 NAPI instances, each using around 25% of a single CPU.

I was initially hoping that TAPI would fit nicely as an extension of
this code, but I don't think that will be the case.

Are there any assumptions you're making about the configuration that I
should try to replicate?
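
For reference, the deferral setup mentioned above would look roughly
like the sketch below (a minimal illustration assuming the standard
napi_defer_hard_irqs and gro_flush_timeout sysfs knobs; the interface
name and the defer count are placeholders, only the ~100us timeout
corresponds to what I described):

/* Minimal sketch: enable hard IRQ deferral and ~100us timer-based
 * polling for one interface via sysfs. "eth0" and the defer count
 * are placeholders, not the exact values behind the numbers above.
 */
#include <stdio.h>

static int write_attr(const char *ifname, const char *attr, const char *val)
{
	char path[128];
	FILE *f;

	snprintf(path, sizeof(path), "/sys/class/net/%s/%s", ifname, attr);
	f = fopen(path, "w");
	if (!f)
		return -1;
	fputs(val, f);
	fclose(f);
	return 0;
}

int main(void)
{
	const char *ifname = "eth0";	/* placeholder interface */

	/* Re-arm NAPI a few times before re-enabling the hard IRQ. */
	write_attr(ifname, "napi_defer_hard_irqs", "2");
	/* Timer-based poll/flush timeout, in nanoseconds (~100us). */
	write_attr(ifname, "gro_flush_timeout", "100000");
	return 0;
}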