On Thu, 2006-04-27 at 14:12 -0700, Caitlin Bestler wrote:
> So the real issue is when there is an intelligent device that
> uses hardware packet classification to place the packet in
> the correct ring. We don't want to bypass packet filtering,
> but it would be terribly wasteful to reclassify the packet.
> Intelligent NICs will have packet classification capabilities
> to support RDMA and iSCSI. Those capabilities should be available
> to benefit SOCK_STREAM and SOCK_DGRAM users as well without it
> being a choice of either turning all stack control over to
> the NIC or ignoring all NIC capabilities beyond pretending
> to be a dumb Ethernet NIC.
> 
> For example, counting packets within an approved connection
> is a valid goal that the final solution should support. But
> would a simple count be sufficient, or do we truly need the
> full flexibility currently found in netfilter?

Note that the problem space AFAICT includes strange advanced routing
setups, ingress QoS and possibly others, not just netfilter.  But
perhaps the same solutions apply, so I'll concentrate on nf.

If we start with a "disable direct netchannels when netfilter hooks
registered", we would inevitably refine it to "disable some netchannels
when netfilter hooks registered".  The worst case for this is filtering
based on connection tracking, with its constantly changing effects as
things time out.  Hard problem.

Is it time to re-examine the Grand Unified Lookup which Dave mentions
every few years? 8)

> My assumption
> is that each input ring has a matching output ring, and that
> the output ring cannot be used to send packets that would
> not be matched by the reverse rule for the paired input ring.
> So the information that supports enforcing that rule needs
> to be stored somewhere other than the ring itself.

Ah, this is a different problem.  Our idea was to have a syscall which
would check & sanitize the buffers for output.  To do this, you need the
ability to chain buffers (a simple next entry in the header, for us).

Sanitization would copy the header into a global buffer (i.e. not one
reachable by userspace), check the flowid, and chain on the rest of the
user buffer.  After it had sanitized the buffers, it would activate the
NIC, which would only send out buffers which started with a kernel
buffer.
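
To make that concrete, here is a rough userspace sketch of the sanitize
step, not real kernel code.  Everything named here is hypothetical:
struct chan_hdr, struct tx_desc, sanitize() and nic_would_send() are
made up for illustration, and the header layout (a flowid plus a payload
length) is an assumption.  The point it shows is that all checks run
against the private copy of the header, never against memory userspace
can still modify.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header at the start of each user buffer (layout assumed). */
struct chan_hdr {
	uint32_t flowid;       /* flow the sender claims to belong to */
	uint32_t payload_len;  /* bytes following the header */
};

/* Kernel-side descriptor: a private header copy chained to user payload. */
struct tx_desc {
	struct chan_hdr hdr;        /* trusted copy, not reachable by userspace */
	const unsigned char *next;  /* rest of the user buffer */
	bool kernel_head;           /* set only by sanitize() on success */
};

/* Returns 0 if the buffer may be sent on allowed_flowid, -1 otherwise. */
static int sanitize(struct tx_desc *d, const unsigned char *ubuf,
		    size_t ulen, uint32_t allowed_flowid)
{
	assert(d != NULL && ubuf != NULL);

	d->kernel_head = false;
	if (ulen < sizeof(d->hdr))
		return -1;

	/* Snapshot first: userspace may rewrite ubuf at any time, so
	 * every check below looks only at the copy. */
	memcpy(&d->hdr, ubuf, sizeof(d->hdr));

	if (d->hdr.flowid != allowed_flowid)
		return -1;
	if (d->hdr.payload_len > ulen - sizeof(d->hdr))
		return -1;

	/* Chain on the rest of the user buffer behind the kernel header. */
	d->next = ubuf + sizeof(d->hdr);
	d->kernel_head = true;
	return 0;
}

/* Stand-in for the NIC rule: only chains that start with a kernel
 * buffer go out. */
static bool nic_would_send(const struct tx_desc *d)
{
	return d->kernel_head;
}
```

The payload itself is never copied, only the header, which is what keeps
this cheap.  A real version would also have to pin the user pages until
the NIC is done with them, which this sketch ignores.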

Of course, the first step (CAP_NET_RAW-only) wouldn't need this.  And,
if the "sanitize_and_send" syscall were PF_VJCHAN's write(), then the
contents of the write() could actually be the header: userspace would
never deal with chained buffers.

Finally, it's not clear how one should sanely mix this with sendfile
etc.  Maybe you don't, and only use this for RDMA, etc.

Cheers!
Rusty.
-- 
 ccontrol: http://ozlabs.org/~rusty/ccontrol
