From: Andi Kleen <[EMAIL PROTECTED]>
Date: Wed, 1 Feb 2006 19:28:46 +0100

> http://www.lemis.com/grog/Documentation/vj/lca06vj.pdf

I did a writeup in my blog about all of this, another good
reason to actively follow my blog:

        http://vger.kernel.org/~davem/cgi-bin/blog.cgi/index.html

Go read.

> -Andi (who prefers sourceware over slideware)

People are definitely hung up on the details, and that means
they are analyzing Van's work from the absolute _wrong_ angle.

This surprised me.  What I expected was for anyone knowledgeable about
networking to get this immediately and, as for the details, to have an
attitude of "I don't care how, let's find a way to make this work!"

But since you're so hung up on the details, the basic idea is that
there is a tiny classifier in the RX IRQ processing of the driver.  We
have to touch that first cache line of the packet headers anyway, so
the classification comes for free.  You'll notice that even though
he's running this tiny classifier in the hard IRQ context, in order to
put the packet on the right RX net channel, IRQ overhead remains the
same.

So when a TCP socket enters established state, we add an entry into
the classifier.  The classifier is even smart enough to look for
a listening socket if the fully established classification fails.
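
To make the classifier idea concrete, here is a rough sketch in C.
This is not Van's code, and every name in it (flow_key, classifier,
classify() and friends) is made up for illustration.  It assumes IPv4
TCP, ignores byte order and entry removal, and uses a plain
open-addressed hash table.  The only point is the two-step lookup:
exact match on the established 4-tuple first, then fall back to a
listener keyed on the local address and port.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; an illustration, not driver code. */

struct flow_key {
    uint32_t saddr, daddr;   /* IPv4 source / destination address */
    uint16_t sport, dport;   /* TCP source / destination port */
};

struct flow_entry {
    struct flow_key key;
    void *chan;              /* opaque handle for the socket's RX net channel */
    int in_use;
};

#define FLOW_TABLE_SIZE 256
static_assert((FLOW_TABLE_SIZE & (FLOW_TABLE_SIZE - 1)) == 0,
              "table size must be a power of two");

struct classifier {
    struct flow_entry established[FLOW_TABLE_SIZE];
    struct flow_entry listening[FLOW_TABLE_SIZE];
};

static uint32_t flow_hash(const struct flow_key *k)
{
    uint32_t h = k->saddr ^ (k->daddr * 2654435761u);

    h ^= ((uint32_t)k->sport << 16) | k->dport;
    h ^= h >> 15;
    return h & (FLOW_TABLE_SIZE - 1);
}

static int key_equal(const struct flow_key *a, const struct flow_key *b)
{
    return a->saddr == b->saddr && a->daddr == b->daddr &&
           a->sport == b->sport && a->dport == b->dport;
}

/* Linear probing; returns 0 on success, -1 if the table is full. */
static int table_insert(struct flow_entry *tbl, const struct flow_key *k,
                        void *chan)
{
    uint32_t start = flow_hash(k);

    for (uint32_t n = 0; n < FLOW_TABLE_SIZE; n++) {
        struct flow_entry *e = &tbl[(start + n) & (FLOW_TABLE_SIZE - 1)];

        if (!e->in_use) {
            e->key = *k;
            e->chan = chan;
            e->in_use = 1;
            return 0;
        }
    }
    return -1;
}

static void *table_lookup(const struct flow_entry *tbl,
                          const struct flow_key *k)
{
    uint32_t start = flow_hash(k);

    for (uint32_t n = 0; n < FLOW_TABLE_SIZE; n++) {
        const struct flow_entry *e = &tbl[(start + n) & (FLOW_TABLE_SIZE - 1)];

        if (!e->in_use)
            return NULL;     /* empty slot ends the probe sequence */
        if (key_equal(&e->key, k))
            return e->chan;
    }
    return NULL;
}

/* Called when a TCP socket enters established state. */
static int classifier_add_established(struct classifier *c,
                                      const struct flow_key *k, void *chan)
{
    return table_insert(c->established, k, chan);
}

/* Listeners match any remote end, so the remote half of the key is zero. */
static int classifier_add_listener(struct classifier *c, uint32_t laddr,
                                   uint16_t lport, void *chan)
{
    struct flow_key k = { .saddr = 0, .daddr = laddr, .sport = 0, .dport = lport };

    return table_insert(c->listening, &k, chan);
}

/* RX path: exact 4-tuple first, then the listener fallback. */
static void *classify(const struct classifier *c, const struct flow_key *pkt)
{
    void *chan = table_lookup(c->established, pkt);

    if (!chan) {
        struct flow_key wild = { .saddr = 0, .daddr = pkt->daddr,
                                 .sport = 0, .dport = pkt->dport };
        chan = table_lookup(c->listening, &wild);
    }
    return chan;
}
```

In this sketch a NULL return from classify() would mean "no channel,
hand the packet to the regular stack"; that policy is my assumption,
not something spelled out in the slides.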

Van is not against NAPI, in fact he's taking NAPI to the next level.
Softirq handling is overhead, and as this work shows, it is totally
unnecessary overhead.

Yes, we do TCP prequeue now, and that's where the second stage net
channel stuff hooks into.  But prequeue as we have it now is not
enough: we still run softirq, and we still do IP input processing from
softirq rather than from user socket context.  The RX net channel
bypasses all of that crap.

The way we do softirq now, we can feed one cpu with softirq work given
a single card; with Van's stuff we can feed socket users on multiple
cpus with a single card.  The SMP friendliness of the net channel data
structure really helps here.
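
For anyone wondering what "SMP friendly" can look like in practice,
here is a minimal sketch of a single-producer/single-consumer ring
using C11 atomics.  It is my own illustration, not the layout from
Van's slides; the names (net_channel, chan_put, chan_get), the 64-slot
size and the 64-byte cache line are all assumptions.  The idea shown is
just that the producer (IRQ side) and the consumer (socket side) each
write only their own index, on their own cache line, so no lock is
needed and the two cpus do not bounce a shared line on every packet.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define CHAN_SLOTS 64        /* assumed size; must be a power of two */
static_assert((CHAN_SLOTS & (CHAN_SLOTS - 1)) == 0,
              "slot count must be a power of two");

struct net_channel {
    /* Written only by the producer (driver RX IRQ). */
    _Alignas(64) atomic_uint tail;
    /* Written only by the consumer (socket user). */
    _Alignas(64) atomic_uint head;
    /* Packet pointers; a slot is owned by whoever the indices say. */
    _Alignas(64) void *slot[CHAN_SLOTS];
};

/* Producer side: returns 0 on success, -1 if the ring is full. */
static int chan_put(struct net_channel *ch, void *pkt)
{
    unsigned tail = atomic_load_explicit(&ch->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&ch->head, memory_order_acquire);

    if (tail - head == CHAN_SLOTS)
        return -1;

    ch->slot[tail & (CHAN_SLOTS - 1)] = pkt;
    /* Release: the slot write must be visible before the new tail. */
    atomic_store_explicit(&ch->tail, tail + 1, memory_order_release);
    return 0;
}

/* Consumer side: returns the oldest packet, or NULL if the ring is empty. */
static void *chan_get(struct net_channel *ch)
{
    unsigned head = atomic_load_explicit(&ch->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&ch->tail, memory_order_acquire);
    void *pkt;

    if (head == tail)
        return NULL;

    pkt = ch->slot[head & (CHAN_SLOTS - 1)];
    /* Release: finish reading the slot before handing it back. */
    atomic_store_explicit(&ch->head, head + 1, memory_order_release);
    return pkt;
}
```

A zero-initialized struct net_channel is an empty ring here.  This
only works with exactly one producer and one consumer per channel,
which is the per-socket case being discussed.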

In one shot the classifier does the input route lookup and the socket
lookup.  We just attach the packet to the socket's RX net channel, all
from hard IRQ context, at zero cost (see above).  This is just like the
grand unified flow cache idea that we've been tossing around for the
past few years.

And the beauty of all of this is that it complements ideas like LRO,
I/O AT, and cpu architectures like Niagara.

How in the world can you not understand how incredible this is?