Hi All,

I have just been upgrading a router from OpenBSD 5.1 to 5.4 and have hit a big problem: in certain circumstances TCP packets have incorrect checksums. I know some checksum work was done recently, so maybe something has gone awry (or I've missed something simple).

I have OpenVPN listening on a CARP interface, which sits on top of a VLAN interface, which sits on top of a trunk (LACP) interface on top of a pair of bnx interfaces. The VPN connection itself sets up just fine. It is a bridging setup with tun0 in a bridge group with another vlan interface.

Once connected I can ping everything just fine. Packets going out of the router to certain places go through NAT. However, TCP connections that go via NAT don't work. TCP connections that take *exactly* the same physical and logical network path, but are not NATed, work just fine.

Running tcpdump on various interfaces shows that the TCP checksum is invalid by the time the OpenVPN client machine gets the packet. It is correct when the packet hits the inbound interface on the OpenBSD 5.4 box. It shows as invalid on its way out of the tun interface, and it is still invalid when it comes out the other end of the VPN tunnel.

So my guess is that when PF rewrites the headers it is not (correctly) recalculating the checksum. The checksum *is* different before and after NAT, so it seems to be attempting to recalculate it but getting it wrong. If that packet were then to go out of a physical interface (e.g. bnx), I guess the NIC would put the correct checksum back on. But as it is being sent down a tun interface instead, it is not getting corrected at all.

Does this sound like a likely hypothesis to anyone who knows the changes that were made?

-Matt
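
P.S. To make clear what I mean by "recalculating the checksum", here is a rough Python sketch of what I would expect a NAT rewrite to do to a TCP checksum. It is only my understanding of the standard Internet checksum and the RFC 1624 incremental update, not PF's actual code. The function names, addresses and ports are made up for illustration.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum of data (zero-padded to an even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src_ip: str, dst_ip: str, tcp_len: int) -> bytes:
    """IPv4 pseudo-header that is covered by the TCP checksum (protocol 6)."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 6, tcp_len))


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Full checksum over pseudo-header + segment (checksum field must be 0)."""
    data = pseudo_header(src_ip, dst_ip, len(segment)) + segment
    return (~ones_complement_sum(data)) & 0xFFFF


def set_checksum(segment: bytes, cksum: int) -> bytes:
    """Return a copy of the segment with the checksum field (bytes 16-17) set."""
    return segment[:16] + struct.pack("!H", cksum) + segment[18:]


def checksum_ok(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """Receiver-side check: the sum including the checksum field is 0xFFFF."""
    data = pseudo_header(src_ip, dst_ip, len(segment)) + segment
    return ones_complement_sum(data) == 0xFFFF


def incremental_update(old_cksum: int, old_bytes: bytes, new_bytes: bytes) -> int:
    """RFC 1624 eqn. 3, HC' = ~(~HC + ~m + m'), applied per 16-bit word."""
    total = (~old_cksum) & 0xFFFF
    for (old_w,), (new_w,) in zip(struct.iter_unpack("!H", old_bytes),
                                  struct.iter_unpack("!H", new_bytes)):
        total += ((~old_w) & 0xFFFF) + new_w
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return (~total) & 0xFFFF


# Hypothetical SYN segment: ports, sequence number and payload are arbitrary.
RAW_SEGMENT = struct.pack("!HHIIBBHHH", 12345, 80, 1, 0, 0x50, 0x02,
                          65535, 0, 0) + b"hello"
INSIDE_IP, NAT_IP, DEST_IP = "192.168.1.10", "203.0.113.5", "198.51.100.7"

before_nat = set_checksum(RAW_SEGMENT,
                          tcp_checksum(INSIDE_IP, DEST_IP, RAW_SEGMENT))
old_cksum = struct.unpack("!H", before_nat[16:18])[0]

# What a correct NAT rewrite of the source address should produce.
new_cksum = incremental_update(old_cksum,
                               socket.inet_aton(INSIDE_IP),
                               socket.inet_aton(NAT_IP))
after_nat = set_checksum(before_nat, new_cksum)
```

With this sketch, `checksum_ok(NAT_IP, DEST_IP, after_nat)` holds, while `checksum_ok(NAT_IP, DEST_IP, before_nat)` does not, because the address changed but the checksum did not. What I am seeing looks like a third case: the checksum changes across NAT, but to a value that does not verify.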