On Wed, Oct 23, 2019 at 11:30 AM Cong Wang <xiyou.wangc...@gmail.com> wrote:
>
> On Wed, Oct 23, 2019 at 11:14 AM Eric Dumazet <eduma...@google.com> wrote:
> > > In case you misunderstand, the CPU profiling I used is captured
> > > during 256 parallel TCP_STREAM.
> >
> > When I asked you the workload, you gave me TCP_RR output, not TCP_STREAM :/
> >
> > "A single netperf TCP_RR could _also_ confirm the improvement:"
>
> I guess you didn't understand what "also" means? The improvement
> can be measured with both TCP_STREAM and TCP_RR, only the
> CPU profiling is done with TCP_STREAM.
>

Except that I could not measure any gain with TCP_RR, which is expected,
since TCP_RR will not use RTO and TLP at the same time.

If you found that we were setting both RTO and TLP when sending a 1-byte
message, we need to fix the stack instead of working around it.

> BTW, I just tested an unpatched kernel on a machine with 64 CPUs,
> turning on/off TLP makes no difference there, so this is likely related
> to the number of CPUs or hardware configurations. This explains
> why you can't reproduce it on your side, so far I can only reproduce
> it on one particular hardware platform too, but it is real.
>

I have hosts with 112 CPUs. I can try on them, but it will take some time.
