Hi folks, this is my first post to freebsd-net, and my first bug-fix submission... I hope this is the right mailing list for this issue, and the right format for sending in patches....
I'm working on a derivative of FreeBSD 7. I've run into a problem with IP header checksums when fragmenting to an e1000 (em) interface, and I've narrowed it down to a very simple test. The test setup is like this:

[computer A]---(network 1)---[computer B]---(network 2)---[computer C]

That gorgeous drawing shows computer A connected to computer B via network 1, and computer B connected to computer C via network 2. Computer B is set up to forward packets between networks 1 and 2. A can see B but not C. C can see B but not A. B forwards between A and C. Pretty simple.

One of B's NICs is a Broadcom, handled by the bce driver; this one works fine in all my testing. B's other NIC is an Intel PRO/1000 handled by the em driver. This is the one giving me trouble.

The test disables PMTUD on all three hosts. It then sets the MTU of the bce and em interfaces to the unrealistically low value of 72 bytes, and tries to pass TCP packets back and forth using nc on computers A and C (with computer B acting as a gateway). This is to force the B gateway to fragment the TCP frames it forwards.

Receiving on the em and sending on the bce works just fine (as noted above). Small TCP frames that fit in the MTU, big TCP frames that get fragmented, no problems.

Receiving on the bce and sending on the em works fine for small TCP frames that don't need fragmentation, but when B has to fragment the IP packets before sending them out the em, the IP header checksums in the packets that appear on the em's wire are wrong. I came to this conclusion by packet capture and by watching the 'bad header checksums' counter of 'netstat -s -p ip', both running on the computer receiving the fragments.

OK, those are all my observations; next come thoughts about the cause & a proposed fix.

The root of the problem is two-fold:

1. ip_output.c:ip_fragment() does not clear the CSUM_IP flag in the mbuf when it does software IP checksum computation, so the mbuf still looks like it needs IP checksumming.

2. The em driver does not advertise IP checksum offloading, but still checks the CSUM_IP flag in the mbuf and modifies the packet when that flag is set (this is in em_transmit_checksum_setup(), called by em_xmit()). Unfortunately the em driver gets the checksum wrong in this case; I guess that's why it doesn't advertise this capability in its if_hwassist!

So the fragments that ip_fastfwd.c:ip_fastforward() gets from ip_output.c:ip_fragment() have ip->ip_sum set correctly, but mbuf->m_pkthdr.csum_flags incorrectly still has CSUM_IP set, and this causes the em driver to emit incorrect packets.

There are some other callers of ip_fragment(), notably ip_output(). ip_output() clears CSUM_IP in the mbuf csum_flags itself if it's not in if_hwassist, so it avoids this problem.

So, the fix is simple: clear the mbuf's CSUM_IP when computing ip->ip_sum in ip_fragment(). The first attached patch (against gitorious/svn_stable_7) does this.

In looking at this issue, I noticed that ip_output()'s use of sw_csum is inconsistent. ip_output() splits the mbuf's csum_flags into two parts: the stuff that hardware will assist with (these flags get left in the mbuf) and the stuff that software needs to do (these get moved to sw_csum). But later ip_output() calls functions that don't get sw_csum, or that don't know to look in it and look in the mbuf instead. My second patch fixes these kinds of issues and (IMO) simplifies the code by leaving all the packet's checksumming needs in the mbuf, getting rid of sw_csum entirely.

--
Sebastian Kuzminsky
Linerate Systems
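P.S. For anyone who would rather not open the attachments, here is a rough user-space sketch of the idea behind the fix. It is not the patch itself and not kernel code: struct sk_pkt, the SK_CSUM_* values and the sk_* functions are all made-up stand-ins (for the mbuf packet header, the CSUM_* flags in sys/mbuf.h, and the relevant bits of ip_fragment() and ip_output() respectively). The point is only that whoever computes the IP header checksum in software should also clear the "still needs an IP checksum" flag, so nothing further down the stack touches the header again.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag values only; the real CSUM_* flags live in sys/mbuf.h. */
#define SK_CSUM_IP  0x0001u  /* IP header checksum still has to be computed */
#define SK_CSUM_TCP 0x0002u  /* TCP checksum still has to be computed */

/* Hypothetical stand-in for the parts of an mbuf this discussion cares about. */
struct sk_pkt {
    uint32_t csum_flags;  /* plays the role of m_pkthdr.csum_flags */
    uint8_t  hdr[20];     /* an option-less IPv4 header */
};

/* One's-complement checksum over an IPv4 header (RFC 1071 style).
 * Returns 0 when run over a header whose checksum field is already correct. */
static uint16_t sk_ip_cksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;

    assert(len % 2 == 0);  /* IPv4 header lengths are multiples of 4 */
    for (size_t i = 0; i < len; i += 2)
        sum += (uint32_t)(hdr[i] << 8 | hdr[i + 1]);
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Sketch of the proposed behaviour for a fragment: compute the header
 * checksum in software AND record that it has been done. */
static void sk_finish_fragment(struct sk_pkt *p)
{
    uint16_t c;

    p->hdr[10] = 0;  /* checksum field is zero while summing */
    p->hdr[11] = 0;
    c = sk_ip_cksum(p->hdr, sizeof(p->hdr));
    p->hdr[10] = (uint8_t)(c >> 8);
    p->hdr[11] = (uint8_t)(c & 0xffu);

    /* The missing step: the header is done, so drop the request flag. */
    p->csum_flags &= ~SK_CSUM_IP;
}

/* Sketch of the split described for ip_output(): flags the interface can
 * handle stay in the packet, the rest are returned for software to do. */
static uint32_t sk_split_csum(uint32_t *csum_flags, uint32_t hwassist)
{
    uint32_t sw_csum = *csum_flags & ~hwassist;

    *csum_flags &= hwassist;
    return sw_csum;
}
```

With this, a fragment that came in with both flags set leaves sk_finish_fragment() with a valid header checksum and only the TCP flag remaining, which is the state a driver that doesn't advertise IP checksum offload should be handed.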
0001-Update-the-mbuf-csum_flags-of-IP-fragments-when-comp.patch
0002-Simplify-the-tracking-of-mbuf-checksumming-needs.patch