jamal wrote:
On Thu, 2007-06-09 at 15:16 +0100, James Chapman wrote:

>> First, do we need to encourage consistency in NAPI poll drivers?

Not to stifle the discussion, but Stephen Hemminger is planning to
write a new howto; that would be a good time to bring up the topic. The
challenge is that there may be hardware issues that result in small
deviations.

Ok.

When a device is in polled mode while idle, there are two scheduling
cases to consider:

1. One or more other netdevs are not idle and consume quota on each poll. The net_rx softirq loops until the next jiffy tick or until the quota is exceeded, calling each device in its poll list. Since the idle device is still in the poll list, it will be polled very rapidly.

One suggestion for limiting the number of polls is to actually have the
driver chew something off the quota even on empty polls - easiest done
by just changing the driver. A simple case would be to charge, say, 1
packet (more may make sense, machine dependent) every time poll is
invoked by the core.

I wanted to minimize the impact on devices that do have work to do. But it's worth investigating. Thanks for the suggestion.

In testing, I see significant reduction in interrupt rate for typical traffic patterns. A flood ping, for example, keeps the device in polled mode, generating no interrupts.

Must be a fast machine.

Not really. I used 3-year-old, single CPU x86 boxes with e100 interfaces. The idle poll change keeps them in polled mode. Without idle poll, I get twice as many interrupts as packets, one for txdone and one for rx. NAPI is continuously scheduled in/out.

In a test, 8510 packets are sent/received versus 6200 previously;

The other packets are dropped?

No. Since I did a flood ping from the machine under test, the improved latency meant that the ping response was handled more quickly, causing the next packet to be sent sooner. So more packets were transmitted in the allotted time (10 seconds).

What are the rtt numbers like?

With current NAPI:
rtt min/avg/max/mdev = 0.902/1.843/101.727/4.659 ms, pipe 9, ipg/ewma 1.611/1.421 ms

With idle poll changes:
rtt min/avg/max/mdev = 0.898/1.117/28.371/0.689 ms, pipe 3, ipg/ewma 1.175/1.236 ms

CPU load is 100% versus 62% previously;

not good.

But the CPU has done more work. The flood ping will always show increased CPU with these changes because the driver always stays in the NAPI poll list. For typical LAN traffic, the average CPU usage doesn't increase as much, though more measurements would be useful.

Your results above showed decreased tput and increased cpu - did you
mistype that?

I didn't use clear English. :) I'm seeing increased throughput, mostly because latency is improved. The increased cpu is partly because of the increased throughput, and partly because ksoftirqd stays busy longer.

despite the CPU load being increased. For a system whose main job is processing network traffic quickly, like an embedded router or a network server, this approach might be very beneficial.

I am not sure I buy that, James ;-> The router types really don't have
much of a challenge in this area.

The problem I started thinking about was the one where NAPI thrashes in/out of polled mode at higher and higher rates as network interface speeds and CPU speeds increase. A flood ping demonstrates this even on 100M links on my boxes. Networking boxes want consistent performance/latency for all traffic patterns and they need to avoid interrupt livelock. Current practice seems to be to use hardware interrupt mitigation or timers to limit interrupt rate but this just hurts latency, as you noted. So I'm trying to find a way to limit the NAPI interrupt rate without increasing latency. My comment about this approach being suitable for routers and networked servers is that these boxes care more about minimizing packet latency than they do about wasting CPU cycles by polling idle devices.

You are doing the right thing by following the path of performance
analysis. I hope you don't get discouraged because the return on
investment may be very low in such work - the majority of the work is in
the testing and analysis (not in puking code endlessly).

Thanks for your feedback. The challenge will be finding the time to do this work. :)

--
James Chapman
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development
