Re: Van Jacobson net channels

2006-02-03 Thread Greg Banks
On Fri, 2006-02-03 at 18:48, Andi Kleen wrote: > On Friday 03 February 2006 02:07, Greg Banks wrote: > > > > (Don't ask for code - it's not really in a usable state) > > > > Sure. I'm looking forward to it. > > I had actually shelved the idea because of TSO. But if you can get me > some data f

Re: Van Jacobson net channels

2006-02-03 Thread Andi Kleen
On Friday 03 February 2006 02:07, Greg Banks wrote: > > (Don't ask for code - it's not really in a usable state) > > Sure. I'm looking forward to it. I had actually shelved the idea because of TSO. But if you can get me some data from your NFS servers that shows TSO is not enough for them that

Re: Van Jacobson net channels

2006-02-02 Thread David S. Miller
From: Greg Banks <[EMAIL PROTECTED]> Date: Fri, 03 Feb 2006 12:08:54 +1100 > So, given 2.6.16 on tg3 hardware, would your advice be to > enable TSO by default? Yes. In fact I've been meaning to discuss with Michael Chan enabling it in the driver by default.

RE: Van Jacobson net channels

2006-02-02 Thread Greg Banks
On Fri, 2006-02-03 at 01:41, Leonid Grossman wrote: > > As I mentioned earlier, it would be cool to get these moderation > thresholds from NAPI, since it can make a better guess about the overall > system utilization than the driver can. Agreed. > But even at the driver level, > this works reas

Re: Van Jacobson net channels

2006-02-02 Thread Greg Banks
On Thu, 2006-02-02 at 18:51, David S. Miller wrote: > From: Greg Banks <[EMAIL PROTECTED]> > Date: Thu, 02 Feb 2006 18:31:49 +1100 > > > On Thu, 2006-02-02 at 17:45, Andi Kleen wrote: > > > Normally TSO was supposed to fix that. > > > > Sure, except that the last time SGI looked at TSO it was >

Re: Van Jacobson net channels

2006-02-02 Thread Greg Banks
On Thu, 2006-02-02 at 18:48, Andi Kleen wrote: > On Thursday 02 February 2006 08:31, Greg Banks wrote: > > > [...]SGI's solution is to ship a script that uses ethtool > > at boot to tune rx-usecs, rx-frames, rx-usecs-irq, rx-frames-irq > > up from the defaults. > > All user tuning like this is

RE: Van Jacobson net channels and NIC channels

2006-02-02 Thread Leonid Grossman
> -Original Message- > From: Andi Kleen [mailto:[EMAIL PROTECTED] > Why are you saying it can't be used by the host? The stack > should be fully ready for it. Sorry, I should have said "it can't be used by the host to the full potential of the feature" :-). It does work for us now, a

Re: Van Jacobson net channels

2006-02-02 Thread Rick Jones
Andi Kleen wrote: On Thursday 02 February 2006 08:31, Greg Banks wrote: The tg3 driver uses small hardcoded values for the RXCOL_TICKS and RXMAX_FRAMES registers, and allows "ethtool -C" to change them. SGI's solution is to ship a script that uses ethtool at boot to tune rx-usecs, rx-frame

RE: Van Jacobson net channels

2006-02-02 Thread Robert Olsson
Leonid Grossman writes: > Right. Interrupt moderation is done on per channel basis. > The only addition to the current NAPI mechanism I'd like to see is to > have NAPI setting desired interrupt rate (once interrupts are ON), > rather than use an interrupt per packet or a driver default. Argu

Re: Van Jacobson net channels

2006-02-02 Thread Rick Jones
Oh you have TSO disabled? That explains a lot. Yes, it's been a bumpy road, and there are still some e1000 lockups, but in general things should be smooth these days. I suspect that "these days" in kernel.org terms differs somewhat from "these days" RH/SuSE/etc terms, hence TSO being disabled

Re: Van Jacobson net channels

2006-02-02 Thread Stephen Hemminger
On Wed, 01 Feb 2006 16:29:11 -0800 (PST) "David S. Miller" <[EMAIL PROTECTED]> wrote: > From: Stephen Hemminger <[EMAIL PROTECTED]> > Date: Wed, 1 Feb 2006 16:12:14 -0800 > > > The bigger problem I see is scalability. All those mmap rings have to > > be pinned in memory to be useful. It's fine f

Re: Van Jacobson net channels and NIC channels

2006-02-02 Thread Andi Kleen
On Thursday 02 February 2006 17:27, Leonid Grossman wrote: > By now we have submitted UFO, MSI-X and LRO patches. The one item on > the TODO list that we did not submit a full driver patch for is the > "support for distributing receive processing across multiple CPUs (using > NIC hw queues)", mai

RE: Van Jacobson net channels

2006-02-02 Thread Leonid Grossman
> -Original Message- > From: Eric W. Biederman [mailto:[EMAIL PROTECTED] > How do you classify channels? Multiple rx steering criterias are available, for example tcp tuple (or subset) hash, direct tcp tuple (or subset) match, MAC address, pkt size, vlan tag, QOS bits, etc. > > If

Van Jacobson net channels and NIC channels

2006-02-02 Thread Leonid Grossman
Thanks to Andi, Dave, Jeff and everyone who responded to the original query; I've got enough pointers to presentations, blogs and ideas to keep me busy for a while :-) VJ channels indeed seem to complement and take to a different level some sw and hw ideas on Dave's TODO list. By now we have subm

Re: Van Jacobson net channels

2006-02-02 Thread Stephen Hemminger
On Thu, 02 Feb 2006 08:35:28 -0700 [EMAIL PROTECTED] (Eric W. Biederman) wrote: > "Christopher Friesen" <[EMAIL PROTECTED]> writes: > > > Eric W. Biederman wrote: > >> Jeff Garzik <[EMAIL PROTECTED]> writes: > > > >>> This was discussed on the netdev list, and the conclusion was that > >>> you wa

Re: Van Jacobson net channels

2006-02-02 Thread Eric W. Biederman
"Leonid Grossman" <[EMAIL PROTECTED]> writes: > There two facilities (at least, in our ASIC, but there is no reason this > can't be part of the generic multi-channel driver interface that I will > get to shortly) to deal with it. > > - hardware supports more than one utilization-based interrupt ra

Re: Van Jacobson net channels

2006-02-02 Thread Eric W. Biederman
"Christopher Friesen" <[EMAIL PROTECTED]> writes: > Eric W. Biederman wrote: >> Jeff Garzik <[EMAIL PROTECTED]> writes: > >>> This was discussed on the netdev list, and the conclusion was that >>> you want both NAPI and hw mitigation. This was implemented in a >>> few drivers, at least. > >> How

RE: Van Jacobson net channels

2006-02-02 Thread Leonid Grossman
> -Original Message- > From: Andi Kleen [mailto:[EMAIL PROTECTED] > > You just need to make sure that you don't leak data from > other peoples > > sockets. > > There are three basic ways I can see to do this: > > - You have really advanced hardware which can potentially > manage

Re: Van Jacobson net channels

2006-02-02 Thread Christopher Friesen
Eric W. Biederman wrote: Jeff Garzik <[EMAIL PROTECTED]> writes: This was discussed on the netdev list, and the conclusion was that you want both NAPI and hw mitigation. This was implemented in a few drivers, at least. How does that deal with the latency that hw mitigation introduces. When

RE: Van Jacobson net channels

2006-02-02 Thread Leonid Grossman
> -Original Message- > From: Eric W. Biederman [mailto:[EMAIL PROTECTED] > Sent: Thursday, February 02, 2006 4:29 AM > To: Jeff Garzik > Cc: Andi Kleen; Greg Banks; David S. Miller; Leonid Grossman; > [EMAIL PROTECTED]; Linux Network Development list > Subject:

Re: Van Jacobson net channels

2006-02-02 Thread Eric W. Biederman
Jeff Garzik <[EMAIL PROTECTED]> writes: > Andi Kleen wrote: >> There was already talk some time ago to make NAPI drivers use >> the hardware mitigation again. The reason is when you have > > > This was discussed on the netdev list, and the conclusion was that you want > both > NAPI and hw mitigat

Re: Van Jacobson net channels

2006-02-01 Thread David S. Miller
From: Greg Banks <[EMAIL PROTECTED]> Date: Thu, 02 Feb 2006 18:31:49 +1100 > On Thu, 2006-02-02 at 17:45, Andi Kleen wrote: > > Normally TSO was supposed to fix that. > > Sure, except that the last time SGI looked at TSO it was > extremely flaky. I gather that's much better now, but TSO > still

Re: Van Jacobson net channels

2006-02-01 Thread Andi Kleen
On Thursday 02 February 2006 08:31, Greg Banks wrote: > The tg3 driver uses small hardcoded values for the RXCOL_TICKS > and RXMAX_FRAMES registers, and allows "ethtool -C" to change > them. SGI's solution is to ship a script that uses ethtool > at boot to tune rx-usecs, rx-frames, rx-usecs-ir

Re: Van Jacobson net channels

2006-02-01 Thread Andi Kleen
On Thursday 02 February 2006 00:50, David S. Miller wrote: > > Why not concentrate your thinking on how it can be made to > _work_ instead of punching holes in the idea? Isn't that more > productive? What I think would be very practical to do would be to try to replace the socket rx que

Re: Van Jacobson net channels

2006-02-01 Thread Andi Kleen
On Thursday 02 February 2006 00:08, Jeff Garzik wrote: > Definitely not. POSIX AIO is far more complex than the operation > requires, Ah, I sense a strong NIH field. > and is particularly bad for implementations that find it wise > to queue a bunch of to-be-filled buffers. Why? lio_listio se

Re: Van Jacobson net channels

2006-02-01 Thread Greg Banks
On Thu, 2006-02-02 at 17:45, Andi Kleen wrote: > There was already talk some time ago to make NAPI drivers use > the hardware mitigation again. The reason is when you have > a workload that runs below overload and doesn't quite > fill the queues and is a bit bursty, then NAPI tends to turn > on/o

Re: Van Jacobson net channels

2006-02-01 Thread Jeff Garzik
Andi Kleen wrote: There was already talk some time ago to make NAPI drivers use the hardware mitigation again. The reason is when you have This was discussed on the netdev list, and the conclusion was that you want both NAPI and hw mitigation. This was implemented in a few drivers, at least

Re: Van Jacobson net channels

2006-02-01 Thread Andi Kleen
On Thursday 02 February 2006 07:49, David S. Miller wrote: > From: Andi Kleen <[EMAIL PROTECTED]> > Date: Thu, 2 Feb 2006 07:45:26 +0100 > > > Don't think it was ever implemented though. In the end we just > > eat the slowdown in that particular load. > > The tg3 driver uses the chip interrupt miti

Re: Van Jacobson net channels

2006-02-01 Thread Andi Kleen
On Thursday 02 February 2006 00:37, Mitchell Blank Jr wrote: > Jeff Garzik wrote: > > Once packets are classified to be delivered to a specific local host socket, > > what further operations require privs? What received packet data > > cannot be exposed to userspace? > > You just need to make sure

RE: Van Jacobson net channels

2006-02-01 Thread Leonid Grossman
> -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of Andi Kleen > Sent: Wednesday, February 01, 2006 10:45 PM > There was already talk some time ago to make NAPI drivers use > the hardware mitigation again. The reason is when you have a > workload

Re: Van Jacobson net channels

2006-02-01 Thread Andi Kleen
On Thursday 02 February 2006 04:19, Greg Banks wrote: > On Thu, 2006-02-02 at 14:13, David S. Miller wrote: > > From: Greg Banks <[EMAIL PROTECTED]> > > Date: Thu, 02 Feb 2006 14:06:06 +1100 > > > > > On Thu, 2006-02-02 at 13:46, David S. Miller wrote: > > > > I know SAMBA is using sendfile() (when

Re: Van Jacobson net channels

2006-02-01 Thread David S. Miller
From: Andi Kleen <[EMAIL PROTECTED]> Date: Thu, 2 Feb 2006 07:45:26 +0100 > Don't think it was ever implemented though. In the end we just > eat the slowdown in that particular load. The tg3 driver uses the chip interrupt mitigation to help deal with the SGI NUMA issues resulting from NAPI.

Re: Van Jacobson net channels

2006-02-01 Thread Andi Kleen
On Thursday 02 February 2006 02:53, Greg Banks wrote: > On Thu, 2006-02-02 at 08:11, David S. Miller wrote: > > Van is not against NAPI, in fact he's taking NAPI to the next level. > > Softirq handling is overhead, and as this work shows, it is totally > > unnecessary overhead. > > I got the impres

Re: Van Jacobson net channels

2006-02-01 Thread Greg Banks
On Thu, 2006-02-02 at 14:32, David S. Miller wrote: > I see. > > Maybe we can be smarter about how the write(), CORK, sendfile, > UNCORK sequence is done. From the NFS server's point of view, the ideal interface would be to pass an array of {page,offset,len} tuples, covering up to around 1 MiB+1

Re: Van Jacobson net channels

2006-02-01 Thread David S. Miller
From: Greg Banks <[EMAIL PROTECTED]> Date: Thu, 02 Feb 2006 14:19:43 +1100 > Multiple trips down through TCP, qdisc, and the driver for each > NFS packet sent: one for the header and one for each page. Lots > of locks need to be taken and dropped, all this while multiple nfds > on multiple CPUs a

Re: Van Jacobson net channels

2006-02-01 Thread Greg Banks
On Thu, 2006-02-02 at 14:13, David S. Miller wrote: > From: Greg Banks <[EMAIL PROTECTED]> > Date: Thu, 02 Feb 2006 14:06:06 +1100 > > > On Thu, 2006-02-02 at 13:46, David S. Miller wrote: > > > I know SAMBA is using sendfile() (when the client has the oplock held, > > > which basically is "always

Re: Van Jacobson net channels

2006-02-01 Thread David S. Miller
From: Greg Banks <[EMAIL PROTECTED]> Date: Thu, 02 Feb 2006 14:06:06 +1100 > On Thu, 2006-02-02 at 13:46, David S. Miller wrote: > > I know SAMBA is using sendfile() (when the client has the oplock held, > > which basically is "always"), is NFS doing so as well? > > NFS is an in-kernel server, an

Re: Van Jacobson net channels

2006-02-01 Thread Greg Banks
On Thu, 2006-02-02 at 13:46, David S. Miller wrote: > I know SAMBA is using sendfile() (when the client has the oplock held, > which basically is "always"), is NFS doing so as well? NFS is an in-kernel server, and uses sock->ops->sendpage directly. > Van does have some ideas in mind for TX net ch

Re: Van Jacobson net channels

2006-02-01 Thread David S. Miller
From: Greg Banks <[EMAIL PROTECTED]> Date: Thu, 02 Feb 2006 12:53:14 +1100 > I got the impression that his code was dynamically changing the > e1000 interrupt mitigation registers in response to load, in > other words using the capabilities of the hardware in a way that > NAPI deliberately avoids

Re: Van Jacobson net channels

2006-02-01 Thread Rick Jones
David S. Miller wrote: From: Rick Jones <[EMAIL PROTECTED]> Date: Wed, 01 Feb 2006 17:32:24 -0800 How large is "the bulk?" The prequeue is always enabled when the app has blocked on read(). Actually I meant in terms of percentage of the cycles to process the packet rather than frequency

Re: Van Jacobson net channels

2006-02-01 Thread Greg Banks
On Thu, 2006-02-02 at 08:11, David S. Miller wrote: > Van is not against NAPI, in fact he's taking NAPI to the next level. > Softirq handling is overhead, and as this work shows, it is totally > unnecessary overhead. I got the impression that his code was dynamically changing the e1000 interrupt m

Re: Van Jacobson net channels

2006-02-01 Thread David S. Miller
From: Rick Jones <[EMAIL PROTECTED]> Date: Wed, 01 Feb 2006 17:32:24 -0800 > How large is "the bulk?" The prequeue is always enabled when the app has blocked on read(). > > Ie. ACK goes out as fast as we can context switch > >to the app receiving the data. This feedback makes all senders >

Re: Van Jacobson net channels

2006-02-01 Thread Rick Jones
Maybe I'm not sufficiently clued-in, but in broad handwaving terms, it seems today that all three can be taking place in parallel for a given TCP connection. The application is doing its application-level thing on request N on one CPU, while request N+1 is being processed by TCP on another CPU, w

Re: Van Jacobson net channels

2006-02-01 Thread David S. Miller
From: Rick Jones <[EMAIL PROTECTED]> Date: Wed, 01 Feb 2006 16:39:00 -0800 > My questions are meant to see if something is even a roadblock in > the first place. Fair enough. > Maybe I'm not sufficiently clued-in, but in broad handwaving terms, > it seems today that all three can be taking place

Re: Van Jacobson net channels

2006-02-01 Thread Rick Jones
David S. Miller wrote: From: Rick Jones <[EMAIL PROTECTED]> Date: Wed, 01 Feb 2006 15:50:38 -0800 [ What sucks about this whole thread is that only folks like Jeff and myself are attempting to think and use our imagination to consider how some roadblocks might be overcome ] My question

Re: Van Jacobson net channels

2006-02-01 Thread David S. Miller
From: Stephen Hemminger <[EMAIL PROTECTED]> Date: Wed, 1 Feb 2006 16:12:14 -0800 > The bigger problem I see is scalability. All those mmap rings have to > be pinned in memory to be useful. It's fine for a single smart application > per server environment, but in real world with many dumb thread m

Re: Van Jacobson net channels

2006-02-01 Thread Stephen Hemminger
On Wed, 01 Feb 2006 15:42:39 -0800 (PST) "David S. Miller" <[EMAIL PROTECTED]> wrote: > From: Andi Kleen <[EMAIL PROTECTED]> > Date: Wed, 1 Feb 2006 23:55:11 +0100 > > > On Wednesday 01 February 2006 21:26, Jeff Garzik wrote: > > > Andi Kleen wrote: > > > > But I don't think Van's design is suppo

Re: Van Jacobson net channels

2006-02-01 Thread David S. Miller
From: Rick Jones <[EMAIL PROTECTED]> Date: Wed, 01 Feb 2006 15:50:38 -0800 [ What sucks about this whole thread is that only folks like Jeff and myself are attempting to think and use our imagination to consider how some roadblocks might be overcome ] > If the TCP processing is put in the

Re: Van Jacobson net channels

2006-02-01 Thread Rick Jones
It almost feels like the channel concept wants a "thread per connection" model? No, it means only that your application must be asynchronous -- which all modern network apps are already. The INN model of a single process calling epoll(2) for 800 sockets should continue to work, as should th

Re: Van Jacobson net channels

2006-02-01 Thread David S. Miller
From: Mitchell Blank Jr <[EMAIL PROTECTED]> Date: Wed, 1 Feb 2006 15:37:04 -0800 > So I agree that this would have to be CAP_NET_ADMIN only. I'm drowning in all of this pessimism folks. Why not concentrate your thinking on how it can be made to _work_ instead of punching holes in the ide

Re: Van Jacobson net channels

2006-02-01 Thread David S. Miller
From: Andi Kleen <[EMAIL PROTECTED]> Date: Wed, 1 Feb 2006 23:55:11 +0100 > On Wednesday 01 February 2006 21:26, Jeff Garzik wrote: > > Andi Kleen wrote: > > > But I don't think Van's design is supposed to be exposed to user space. > > > > It is supposed to be exposed to userspace AFAICS. > > Th

Re: Van Jacobson net channels

2006-02-01 Thread Mitchell Blank Jr
Jeff Garzik wrote: > Once packets are classified to be delivered to a specific local host socket, > what further operations require privs? What received packet data > cannot be exposed to userspace? You just need to make sure that you don't leak data from other people's sockets. Two issues I se

Re: Van Jacobson net channels

2006-02-01 Thread Rick Jones
But people who care about the performance of their networking apps are likely to want to switch over to this new userspace networking API, over the next decade, I think. Yet there needs to be some cross-platform commonality for the API yes? That was the main thrust behind my simplistic aski

Re: Van Jacobson net channels

2006-02-01 Thread Jeff Garzik
Andi Kleen wrote: On Wednesday 01 February 2006 21:26, Jeff Garzik wrote: Andi Kleen wrote: But I don't think Van's design is supposed to be exposed to user space. It is supposed to be exposed to userspace AFAICS. Then it's likely insecure and root only, unless he knows some magic that w

Re: Van Jacobson net channels

2006-02-01 Thread Jeff Garzik
Rick Jones wrote: what are the implications for having the application churning away doing application things while TCP is feeding it data? Or for an application that is processing more than one TCP connection in a given thread? It almost feels like the channel concept wants a "thread per con

Re: Van Jacobson net channels

2006-02-01 Thread Jeff Garzik
Rick Jones wrote: Jeff Garzik wrote: Key point 1: Van's slides align closely with the design that I was already working on, for zero-copy RX. To have a fully async, zero copy network receive, POSIX read(2) is inadequate. Is there an aio_read() in POSIX adequate to the task? Definitel

Re: Van Jacobson net channels

2006-02-01 Thread Andi Kleen
On Wednesday 01 February 2006 21:26, Jeff Garzik wrote: > Andi Kleen wrote: > > But I don't think Van's design is supposed to be exposed to user space. > > It is supposed to be exposed to userspace AFAICS. Then it's likely insecure and root only, unless he knows some magic that we don't. I hope

Re: Van Jacobson net channels

2006-02-01 Thread Andi Kleen
On Wednesday 01 February 2006 22:11, David S. Miller wrote: > From: Andi Kleen <[EMAIL PROTECTED]> > Date: Wed, 1 Feb 2006 19:28:46 +0100 > > > http://www.lemis.com/grog/Documentation/vj/lca06vj.pdf > > I did a writeup in my blog about all of this, another good > reason to actively follow my blog

Re: Van Jacobson net channels

2006-02-01 Thread Rick Jones
At the risk of being told to launch myself towards a body of water... So, sort of linking with the data about saturating a GbE both ways on a single TCP connection, and how it required binding netperf to the CPU other than the one taking interrupts... If channels are taken to their limit, and t

Re: Van Jacobson net channels

2006-02-01 Thread David S. Miller
From: Jeff Garzik <[EMAIL PROTECTED]> Date: Wed, 01 Feb 2006 14:37:46 -0500 > So, I am not concerned with slideware. These are two good ideas that > are worth pursuing, even if Van produces zero additional output. Right. And, to all of you having trouble imagining how else you'd apply these ne

Re: Van Jacobson net channels

2006-02-01 Thread David S. Miller
From: Andi Kleen <[EMAIL PROTECTED]> Date: Wed, 1 Feb 2006 20:50:31 +0100 > On Wednesday 01 February 2006 20:37, Jeff Garzik wrote: > > > To have a fully async, zero copy network receive, POSIX read(2) is > > inadequate. > > Agreed, but POSIX aio is adequate. No, it's a joke. To do this stu

Re: Van Jacobson net channels

2006-02-01 Thread David S. Miller
From: Andi Kleen <[EMAIL PROTECTED]> Date: Wed, 1 Feb 2006 19:28:46 +0100 > http://www.lemis.com/grog/Documentation/vj/lca06vj.pdf I did a writeup in my blog about all of this, another good reason to actively follow my blog: http://vger.kernel.org/~davem/cgi-bin/blog.cgi/index.html Go r

Re: Van Jacobson net channels

2006-02-01 Thread Jeff Garzik
Andi Kleen wrote: But I don't think Van's design is supposed to be exposed to user space. It is supposed to be exposed to userspace AFAICS. It's still in the kernel, just in process context. Incorrect. It's in the userspace app (though usually via a library). See slides 26 and 27. But i

Re: Van Jacobson net channels

2006-02-01 Thread Arnaldo Carvalho de Melo
On 2/1/06, Andi Kleen <[EMAIL PROTECTED]> wrote: > On Wednesday 01 February 2006 20:37, Jeff Garzik wrote: > > > To have a fully async, zero copy network receive, POSIX read(2) is > > inadequate. > > Agreed, but POSIX aio is adequate. > > > One needs a ring buffer, similar in API to the mmap'd > >

Re: Van Jacobson net channels

2006-02-01 Thread Jonathan Corbet
Andi writes: > But I don't think Van's design is supposed to be exposed to user space. > It's just a better way to implement BSD sockets. Actually, it can, indeed, go all the way to user space - connecting channels to the socket layer was one of the intermediate steps. FWIW, I did an article on

Re: Van Jacobson net channels

2006-02-01 Thread Rick Jones
Jeff Garzik wrote: Key point 1: Van's slides align closely with the design that I was already working on, for zero-copy RX. To have a fully async, zero copy network receive, POSIX read(2) is inadequate. Is there an aio_read() in POSIX adequate to the task? One needs a ring buffer, simila

Re: Van Jacobson net channels

2006-02-01 Thread Andi Kleen
On Wednesday 01 February 2006 20:37, Jeff Garzik wrote: > To have a fully async, zero copy network receive, POSIX read(2) is > inadequate. Agreed, but POSIX aio is adequate. > One needs a ring buffer, similar in API to the mmap'd > packet socket, where you can queue a whole bunch of reads.

Re: Van Jacobson net channels

2006-02-01 Thread Jeff Garzik
Key point 1: Van's slides align closely with the design that I was already working on, for zero-copy RX. To have a fully async, zero copy network receive, POSIX read(2) is inadequate. One needs a ring buffer, similar in API to the mmap'd packet socket, where you can queue a whole bunch of r

Re: Van Jacobson net channels

2006-02-01 Thread Andi Kleen
On Wednesday 01 February 2006 14:48, Leonid Grossman wrote: > David S. Miller wrote: > > > And with Van Jacobson net channels, none of this is going to > > matter and 512 is going to be your limit whether you like it > > or not. So this short term complexity gain