On Sat, 23 Dec 2006, Robert Watson wrote:
> On Sat, 23 Dec 2006, John Polstra wrote:
>> That said, dropping and regrabbing the driver lock in the rxeof routine of
>> any driver is bad. It may be safe to do, but it incurs horrible
>> performance penalties. It essentially allows the time-critical, high
>> priority RX path to be constantly preempted by the lower priority if_start
>> or if_ioctl paths. Even without this preemption and priority inversion,
>> you're doing an excessive number of expensive lock ops in the fast path.
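
For concreteness, the pattern being criticized is roughly the following (a
simplified sketch with made-up xx_* names, not the actual bge_rxeof() code):

    static void
    xx_rxeof(struct xx_softc *sc)
    {
        struct ifnet *ifp = sc->xx_ifp;
        struct mbuf *m;

        XX_LOCK_ASSERT(sc);
        while ((m = xx_next_rx_mbuf(sc)) != NULL) {
            /* Drop the driver lock for every single packet... */
            XX_UNLOCK(sc);
            (*ifp->if_input)(ifp, m);
            /* ...and regrab it; if_start()/if_ioctl() can slip in here. */
            XX_LOCK(sc);
        }
    }
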
It's not very time-critical or high priority for bge or any other device
that has a reasonably large rx ring. With a ring size of 512 and an rx
interrupt occurring not too near the end (say at half way), you have 256
packet times to finish processing the interrupt. For normal 1518 byte
packets at 1Gbps, 256 packet times is about 3 ms. bge's rx ring size
is actually larger than 512 for most hardware.
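
(For reference, the arithmetic: 1518 bytes is 12144 bits, i.e. about 12.1 us
of wire time per packet at 1 Gbps, so 256 packets take roughly 3.1 ms, a
little more once preamble and inter-frame gap are counted.)
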
>> We currently make this a lot worse than it needs to be by handing off the
>> received packets one at a time, unlocking and relocking for every packet.
>> It would be better if the driver's receive interrupt handler would harvest
>> all of the incoming packets and queue them locally. Then, at the end, hand
>> off the linked list of packets to the network stack wholesale, unlocking
>> and relocking only once. (Actually, the list could probably be handed off
>> at the very end of the interrupt service routine, after the driver has
>> already dropped its lock.) We wouldn't even need a new primitive, if
>> ether_input() and the other if_input() functions were enhanced to deal with
>> a possible list of packets instead of just a single one.
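
A minimal sketch of that scheme, again with made-up xx_* names; the second
loop below calls if_input() once per packet with the lock already dropped,
and would collapse to a single call if if_input() learned to take a list:

    static void
    xx_rxeof(struct xx_softc *sc)
    {
        struct ifnet *ifp = sc->xx_ifp;
        struct mbuf *m, *head = NULL, **tailp = &head;

        XX_LOCK_ASSERT(sc);
        /* Harvest the whole ring into a local list with the lock held. */
        while ((m = xx_next_rx_mbuf(sc)) != NULL) {
            *tailp = m;
            tailp = &m->m_nextpkt;
        }
        /* Hand everything to the stack once, with the lock dropped. */
        XX_UNLOCK(sc);
        while ((m = head) != NULL) {
            head = m->m_nextpkt;
            m->m_nextpkt = NULL;
            (*ifp->if_input)(ifp, m);
        }
        XX_LOCK(sc);
    }
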
Do a bit more than that and you have reinvented fast interrupt handling
:-). However, with large buffers the complications of fast interrupt
handling are not really needed. A fast interrupt handler would queue
all the packets (taking care not to be blocked by normal spinlocks etc.,
unlike the "fast" interrupt handlers in -current) and then schedule a
low[er] priority thread to finish the handling. With large buffers, the
lower priority thread can just be scheduled immediately.
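
Something in that direction can be sketched with taskqueue(9); made-up xx_*
names again, the handler here defers even the harvesting to the thread
(good enough with a large ring), and it assumes the task was initialized at
attach time with TASK_INIT(&sc->xx_rx_task, 0, xx_rx_task, sc):

    /* Fast handler: touch nothing that can block, just mask and defer. */
    static void
    xx_fast_intr(void *arg)
    {
        struct xx_softc *sc = arg;

        xx_disable_intr(sc);    /* hypothetical "mask in hardware" helper */
        taskqueue_enqueue(taskqueue_fast, &sc->xx_rx_task);
    }

    /* Lower priority taskqueue thread: the real work, under the lock. */
    static void
    xx_rx_task(void *arg, int pending)
    {
        struct xx_softc *sc = arg;

        XX_LOCK(sc);
        xx_rxeof(sc);           /* drain the (large) rx ring */
        XX_UNLOCK(sc);
        xx_enable_intr(sc);     /* unmask once caught up */
    }
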
> I try this experiment every few years, and generally don't measure much
> improvement. I'll try it again with 10gbps early next year once back in the
> office again. The more interesting transition is between the link layer and
> the network layer, which is high on my list of topics to look into in the
> next few weeks. In particular, reworking the ifqueue handoff. The tricky
> bit is balancing latency, overhead, and concurrency...
These are very unbalanced now, so you don't have to worry about breaking
the balance :-). I normally unbalance to optimize latency (45-60 uS ping
latency).
Bruce