Rick Jones wrote:
Jeff Garzik wrote:
Key point 1:
Van's slides align closely with the design that I was already working
on, for zero-copy RX.
To have a fully async, zero-copy network receive, POSIX read(2) is
inadequate.
Is there an aio_read() in POSIX adequate to the task?
Definitely not. POSIX AIO is far more complex than the operation
requires, and is particularly bad for implementations that find it wise
to queue a bunch of to-be-filled buffers. Further, the current
implementation of POSIX AIO uses a thread for almost every I/O, which is
yet more overkill.
A simple mmap'd ring buffer is much closer to how the hardware actually
behaves. It's no surprise that the "ring buffer / doorbell" pattern
pops up all over the place in computing these days.
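
To make the "ring buffer / doorbell" idea concrete, here is a minimal
single-producer, single-consumer sketch in C11. It is only an
illustration, not the design from Van's slides or from any kernel
patch: the names rx_desc, rx_ring, ring_post and ring_poll, the
descriptor layout, and the slot count are all hypothetical, and the
ring lives in ordinary process memory rather than in an mmap'd region
shared with the kernel. The producer's release-store to head plays
the role of the doorbell.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SLOTS 8u   /* hypothetical size; must be a power of two */

/* Hypothetical descriptor: one filled receive buffer. */
struct rx_desc {
    void    *buf;       /* start of the data */
    uint32_t len;       /* number of valid bytes */
};

struct rx_ring {
    struct rx_desc   slot[RING_SLOTS];
    _Atomic uint32_t head;  /* written only by the producer (the "doorbell") */
    _Atomic uint32_t tail;  /* written only by the consumer */
};

static void ring_init(struct rx_ring *r)
{
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Producer side: publish one filled buffer. Returns false if the ring is full. */
static bool ring_post(struct rx_ring *r, void *buf, uint32_t len)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RING_SLOTS)
        return false;                       /* no free slot */

    r->slot[head & (RING_SLOTS - 1)] = (struct rx_desc){ .buf = buf, .len = len };

    /* Ring the doorbell: the release store makes the slot contents
     * visible before the consumer can observe the new head. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side: take the oldest filled buffer. Returns false if the ring is empty. */
static bool ring_poll(struct rx_ring *r, struct rx_desc *out)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail)
        return false;                       /* nothing posted yet */

    *out = r->slot[tail & (RING_SLOTS - 1)];

    /* Hand the slot back to the producer. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

Neither side takes a lock or makes a system call on the fast path;
each index has exactly one writer, which is what lets the same layout
work between a device and a driver or, in principle, between kernel
and userspace.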
Are you speaking strictly in the context of a single TCP connection, or
for multiple TCP connections? For the latter, getting out of the kernel
isn't a priori a requirement. Actually, I'm not even sure it is a
priori a requirement for the former?
Getting the TCP receive path out of the kernel isn't a requirement, just
an improvement.
You'll always have to have a basic path for existing applications that
do normal read(2) and write(2). You can't break something that fundamental.
But people who care about the performance of their networking apps are
likely to want to switch over to this new userspace networking API, over
the next decade, I think.
Jeff