David S. Miller wrote:

From: Mark Butler <[EMAIL PROTECTED]>
Date: Fri, 24 Mar 2006 22:37:26 -0700

On a more general note, I find the idea that a current dst entry doesn't actually reflect the interface (even a logical interface) and nexthop that will be used to deliver a packet a little disturbing. It would seem to me that any filter that is going to re-route a packet to a different address or a different interface should be a logical device (with its own IP address) or logical interface, respectively. Otherwise what is going on is completely invisible to the transport protocol, as well as users of tools like traceroute.

Welcome to firewalls and NAT.

A true firewall should never need to do anything but drop packets and reset connections. Changes to the way packets are routed should be done at the routing layer, using the flow information from the transport layer. Simple firewall rules should be implemented the same way. By the time a dst entry is returned, the need for NF output chain processing should be minimal to non-existent. Serialized processing of every IP packet, whether it needs it or not, is ridiculously inefficient. No high-capacity router would operate that way. A route decision for a flow would be made once, and data in most flows would use a fast (generally hardware) path without further consideration.
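To make the "decide once per flow, then take the fast path" idea concrete, here is a minimal user-space sketch in C. Everything in it is hypothetical and for illustration only: `struct flow_key`, `struct route`, `slow_route_lookup()` and `flow_route()` are made-up names, not the kernel's `struct flowi`, dst cache or routing API, and the slow path is a stand-in for whatever full routing and policy decision would really be made.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow key (illustration only, not the kernel's struct flowi). */
struct flow_key {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
    uint8_t  proto;
};

/* Hypothetical result of a route decision. */
struct route {
    int      ifindex;
    uint32_t nexthop;
};

#define FLOW_CACHE_SLOTS 64

struct flow_cache_entry {
    int             valid;
    struct flow_key key;
    struct route    rt;
};

struct flow_cache {
    struct flow_cache_entry slot[FLOW_CACHE_SLOTS];
    unsigned long           slow_lookups;   /* how often the slow path ran */
};

static int flow_key_equal(const struct flow_key *a, const struct flow_key *b)
{
    return a->saddr == b->saddr && a->daddr == b->daddr &&
           a->sport == b->sport && a->dport == b->dport &&
           a->proto == b->proto;
}

static unsigned int flow_hash(const struct flow_key *k)
{
    uint32_t h = k->saddr ^ (k->daddr * 2654435761u);

    h ^= ((uint32_t)k->sport << 16) | k->dport;
    h ^= k->proto;
    return h % FLOW_CACHE_SLOTS;
}

/* Stand-in for the full (expensive) routing and policy decision. */
static struct route slow_route_lookup(const struct flow_key *k)
{
    struct route rt;

    rt.ifindex = (k->daddr >> 24) == 10 ? 1 : 2;  /* arbitrary toy policy */
    rt.nexthop = k->daddr;
    return rt;
}

/* Decide once per flow; later packets of the same flow hit the cache. */
static struct route flow_route(struct flow_cache *fc, const struct flow_key *k)
{
    struct flow_cache_entry *e;

    assert(fc != NULL && k != NULL);
    e = &fc->slot[flow_hash(k)];
    if (!e->valid || !flow_key_equal(&e->key, k)) {
        e->rt = slow_route_lookup(k);   /* slow path: first packet or collision */
        e->key = *k;
        e->valid = 1;
        fc->slow_lookups++;
    }
    return e->rt;                        /* fast path */
}
```

Calling `flow_route()` twice with the same key on a zero-initialized `struct flow_cache` runs the slow path only once. A real implementation would also need invalidation when routes or rules change, which this sketch leaves out.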

Of course NAT processing only needs to be done on the NF forward chain, not the input or output chains. No need to affect local transport protocols at all. The need for any kind of NF processing should be reflected in the routing tables, and echoed in the dst entry (or dst entry stack). There has been discussion of Van Jacobson-style optimization of the input chain. Well, the quickest way to optimize the output chain would be to return filtered routing information to the transport layer so that a transport protocol could run its own output processing. For example, why should IPSEC encryption be delayed to the moment of transmission? Why should a re-transmitted packet be re-encrypted? Performance would be improved significantly if a transport protocol could arrange for IPSEC transformations to be done in advance, so that when an ACK arrived that opened the congestion window, data could be transmitted without further delay. Same deal for retransmissions. IPSEC encryption would then generally occur in the process context of the sender, rather than in softirq context at the last possible moment.
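A rough sketch of the "transform in advance, transmit later" idea, in user-space C. All names here (`struct segment`, `segment_prepare()`, `segment_transmit()`, `toy_transform()`) are hypothetical, and the transform is a placeholder XOR, not encryption and not anything resembling a real IPSEC transform. The only point is that the transformed bytes are cached with the queued segment, so a later transmit or retransmit does not redo the work.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SEG_MAX 64

/* Hypothetical queued segment that keeps its wire-ready form around. */
struct segment {
    unsigned char plain[SEG_MAX];   /* data as handed over by the sender */
    unsigned char wire[SEG_MAX];    /* transformed bytes, ready to send */
    size_t        len;
    int           wire_ready;       /* nonzero once wire[] is valid */
};

static unsigned long transform_calls;   /* counts how often the work is done */

/* Placeholder transform (XOR with a fixed byte). NOT encryption. */
static void toy_transform(const unsigned char *in, unsigned char *out,
                          size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        out[i] = in[i] ^ 0x5a;
    transform_calls++;
}

/* Queue new data; the cached wire form is not valid yet. */
static void segment_queue(struct segment *seg, const void *data, size_t len)
{
    assert(len <= SEG_MAX);
    memcpy(seg->plain, data, len);
    seg->len = len;
    seg->wire_ready = 0;
}

/* Meant to run in the sender's context, before the window opens. */
static void segment_prepare(struct segment *seg)
{
    if (!seg->wire_ready) {
        toy_transform(seg->plain, seg->wire, seg->len);
        seg->wire_ready = 1;
    }
}

/* Meant to run when the window opens, or on retransmit. */
static const unsigned char *segment_transmit(struct segment *seg)
{
    segment_prepare(seg);   /* no-op if the segment was prepared earlier */
    return seg->wire;
}
```

If `segment_prepare()` has already run, any number of `segment_transmit()` calls leave `transform_calls` at 1. Whether a real transform can be reused across retransmissions depends on details this sketch ignores entirely.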

Same thing for neighbor discovery delays and IP fragmentation. Instead of holding a packet somewhere in the IP layer waiting for an ARP reply, the transport driver should just get an appropriate notification. Then it could (for example) bundle additional data into the same packet in the meantime.

Transports could easily hold IP fragments for further processing as well. Some of them (notably DCCP) can profitably make use of IP datagrams with missing segments. Other transports could use the information to make better determinations about congestion and packet loss. In any case, IP segmentation and reassembly at the transport layer would be more efficient and would be a straightforward extension of what is already present for anything more sophisticated than UDP.

You don't know anything until the packet is examined by the filter,
because it's impossible to know what rule would be matched until the
packet is actually built, since the rule matching is on packet
contents (such as the source and destination IP addresses, and source
and destination ports, but more obscure matching is also possible, like
matching by TOS or other IP header flags).

The flowi structure already contains all that information for routing purposes. No reason why it could not be used to do early netfilter reduction as well. Right?
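As a sketch of what "early reduction" against flow information might look like, the C fragment below filters a rule list down to the rules that could still match a given flow, using only fields known before any packet is built. The types and functions (`struct flow_key`, `struct rule`, `reduce_rules()`) are hypothetical, are not the kernel's `struct flowi` or any netfilter interface, and assume a very simple rule format where a zero mask, port or protocol means "any".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow key (illustration only, not the kernel's struct flowi). */
struct flow_key {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
    uint8_t  proto;
};

/* Hypothetical filter rule; zero mask, port or proto acts as a wildcard. */
struct rule {
    uint32_t daddr, daddr_mask;
    uint16_t dport;
    uint8_t  proto;
    int      needs_payload;   /* also inspects data beyond the flow key */
};

/* Can this rule possibly match packets belonging to this flow? */
static int rule_matches_flow(const struct rule *r, const struct flow_key *k)
{
    if ((k->daddr & r->daddr_mask) != (r->daddr & r->daddr_mask))
        return 0;
    if (r->dport && r->dport != k->dport)
        return 0;
    if (r->proto && r->proto != k->proto)
        return 0;
    return 1;
}

/*
 * Reduce the rule list to the candidates for one flow.
 * Returns the number of candidates written to out[] (out must have room
 * for n pointers). *per_packet is set to 1 if any surviving rule still
 * needs to look at individual packets, 0 otherwise.
 */
static size_t reduce_rules(const struct rule *rules, size_t n,
                           const struct flow_key *k,
                           const struct rule **out, int *per_packet)
{
    size_t i, kept = 0;

    assert(rules != NULL && k != NULL && out != NULL && per_packet != NULL);
    *per_packet = 0;
    for (i = 0; i < n; i++) {
        if (!rule_matches_flow(&rules[i], k))
            continue;                 /* rule can never match this flow */
        out[kept++] = &rules[i];
        if (rules[i].needs_payload)
            *per_packet = 1;
    }
    return kept;
}
```

A flow for which `reduce_rules()` returns 0 would need no further filter processing at all under these assumptions, and one with `*per_packet == 0` could have its verdict settled once for the whole flow. Rules that match on packet contents beyond the flow key still have to be evaluated per packet, which is what the `needs_payload` flag stands for here.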

- Mark B.


