Well, after a few more hours of running, it's fairly easy to catch the packets with tcpdump, but not as easy to see whether there is a pattern to them, or what distinguishes them from the other packets that pass with normal sizes.
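One thing that can be checked mechanically in captures like the ones in this message is whether the reported lengths are at least self-consistent. The sketch below is only an illustration: the struct and function names are made up, the 20-byte IP header is read off the 0x4500 in the hex dump, and the 32-byte TCP header (20 bytes plus 12 bytes of nop,nop,timestamp options) is an assumption, since tcpdump truncated the options.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETHER_HDR_LEN 14u  /* untagged Ethernet header */
#define IP_HDR_LEN    20u  /* IPv4 header, IHL 5 (0x45 in the dump) */
#define TCP_HDR_LEN   32u  /* ASSUMPTION: 20 bytes + nop,nop,TS options */

/* One tcpdump record, reduced to the fields of interest (hypothetical type). */
struct capture_sample {
	uint32_t seq_start;  /* first number of "seq a:b" */
	uint32_t seq_end;    /* second number of "seq a:b" */
	uint32_t ip_len;     /* the inner "length" (IP total length) */
	uint32_t frame_len;  /* the outer "length" (frame on the wire) */
};

/* TCP payload bytes; unsigned subtraction also copes with sequence wrap. */
static uint32_t
tcp_payload_len(const struct capture_sample *s)
{
	assert(s != NULL);
	return (s->seq_end - s->seq_start);
}

/* True if payload + headers matches both lengths tcpdump printed. */
static int
lengths_consistent(const struct capture_sample *s)
{
	uint32_t expect_ip = tcp_payload_len(s) + IP_HDR_LEN + TCP_HDR_LEN;

	return (s->ip_len == expect_ip &&
	    s->frame_len == s->ip_len + ETHER_HDR_LEN);
}
```

For the first record (seq 3009729118:3009794554) this gives 65436 payload bytes, plus 52 bytes of headers for an IP length of 65488, plus 14 for a frame length of 65502, which matches what tcpdump printed. So under these assumptions the lengths add up; the check says nothing about why these particular sizes are chosen.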
I'm using:

    tcpdump -ennvvvSuxx -i ix0 -s 64 greater 65495

Here's some output.

18:41:41.311025 00:1b:21:d6:4c:4c > 00:50:56:7d:b8:ff, ethertype IPv4 (0x0800), length 65502: (tos 0x0, ttl 64, id 37273, offset 0, flags [DF], proto TCP (6), length 65488, bad cksum 0 (->50ee)!)
    172.16.0.30.2049 > 172.16.0.97.947: Flags [P.], seq 3009729118:3009794554, ack 3477042952, win 28478, options [nop,nop,TS[|tcp]>
        0x0000:  0050 567d b8ff 001b 21d6 4c4c 0800 4500
18:42:11.284028 00:1b:21:d6:4c:4c > 00:50:56:7d:b8:ff, ethertype IPv4 (0x0800), length 65502: (tos 0x0, ttl 64, id 52388, offset 0, flags [DF], proto TCP (6), length 65488, bad cksum 0 (->15e3)!)
    172.16.0.30.2049 > 172.16.0.97.947: Flags [.], seq 1533469358:1533534794, ack 478673276, win 29127, options [nop,nop,TS[|tcp]>
        0x0000:  0050 567d b8ff 001b 21d6 4c4c 0800 4500
18:42:31.385082 00:1b:21:d6:4c:4c > 00:50:56:7d:b8:ff, ethertype IPv4 (0x0800), length 65498: (tos 0x0, ttl 64, id 25808, offset 0, flags [DF], proto TCP (6), length 65484, bad cksum 0 (->7dbb)!)
    172.16.0.30.2049 > 172.16.0.97.947: Flags [P.], seq 3658906462:3658971894, ack 1460462120, win 29127, options [nop,nop,TS[|tcp]>
        0x0000:  0050 567d b8ff 001b 21d6 4c4c 0800 4500
18:42:45.200094 00:1b:21:d6:4c:4c > 00:50:56:7d:b8:ff, ethertype IPv4 (0x0800), length 65502: (tos 0x0, ttl 64, id 43985, offset 0, flags [DF], proto TCP (6), length 65488, bad cksum 0 (->36b6)!)
    172.16.0.30.2049 > 172.16.0.97.947: Flags [P.], seq 805280454:805345890, ack 2122788052, win 29127, options [nop,nop,TS[|tcp]>
        0x0000:  0050 567d b8ff 001b 21d6 4c4c 0800 4500
18:43:16.601738 00:1b:21:d6:4c:4c > 00:50:56:7d:b8:ff, ethertype IPv4 (0x0800), length 65502: (tos 0x0, ttl 64, id 5657, offset 0, flags [DF], proto TCP (6), length 65488, bad cksum 0 (->cc6e)!)
    172.16.0.30.2049 > 172.16.0.97.947: Flags [.], seq 3978046962:3978112398, ack 3596907688, win 29127, options [nop,nop,TS[|tcp]>
        0x0000:  0050 567d b8ff 001b 21d6 4c4c 0800 4500
18:43:37.345685 00:1b:21:d6:4c:4c > 00:50:56:7d:b8:ff, ethertype IPv4 (0x0800), length 65506: (tos 0x0, ttl 64, id 41062, offset 0, flags [DF], proto TCP (6), length 65492, bad cksum 0 (->421d)!)
    172.16.0.30.2049 > 172.16.0.97.947: Flags [P.], seq 1419570518:1419635958, ack 104148460, win 29127, options [nop,nop,TS[|tcp]>
        0x0000:  0050 567d b8ff 001b 21d6 4c4c 0800 4500
18:45:50.266944 00:1b:21:d6:4c:4c > 00:50:56:7d:b8:ff, ethertype IPv4 (0x0800), length 65506: (tos 0x0, ttl 64, id 5853, offset 0, flags [DF], proto TCP (6), length 65492, bad cksum 0 (->cba6)!)
    172.16.0.30.2049 > 172.16.0.97.947: Flags [P.], seq 2161102562:2161168002, ack 2086338240, win 29127, options [nop,nop,TS[|tcp]>

With the IP_MAXPACKET = 65495, I've had zero problems with networking.


On Mon, Mar 24, 2014 at 1:23 PM, Christopher Forgeron <csforge...@gmail.com> wrote:

> I think making hw_tsomax a sysctl would be a good patch to commit - It
> could enable easy debugging/performance testing for the masses.
>
> I'm curious to hear how your environment is working with a tso turned off
> on your nics.
>
> My testbed just hit the 2 hour mark. With TSO off, I don't get a single
> packet over IP_MAXPACKET. That puts my confidence at around 95% in the
> statement 'turning off tso negates this issue for me'.
>
> I'm now rebooting into a +tso env to see if I can capture the bad packets.
>
> I am also sure that the netstat -m mbuf denied is a completely separate
> issue. I'm going around the lab and powering up different boxes with
> 10.0-RELEASE, and they all have mbuf/mbuf clusters denied on boot, and that
> number increases with network traffic. It's probably not helping the
> IP_MAXPACKET issue.
>
>
>
> I'll create a separate thread for that one shortly.
>
>
> On Mon, Mar 24, 2014 at 1:14 PM, Markus Gebert <markus.geb...@hostpoint.ch> wrote:
>
>>
>> On 24.03.2014, at 16:21, Christopher Forgeron <csforge...@gmail.com> wrote:
>>
>> > This is regarding the TSO patch that Rick suggested earlier. (With many
>> > thanks for his time and suggestion)
>> >
>> > As I mentioned earlier, it did not fix the issue on a 10.0 system. It did
>> > make it less of a problem on 9.2, but either way, I think it's not needed,
>> > and shouldn't be considered as a patch for testing/etc.
>> >
>> > Patching TSO to anything other than a max value (and by default the code
>> > gives it IP_MAXPACKET) is confusing the matter, as the packet length
>> > ultimately needs to be adjusted for many things on the fly like TCP
>> > Options, etc. Using static header sizes won't be a good idea.
>> >
>> > Additionally, it seems that setting nic TSO will/may be ignored by code
>> > like this in sys/netinet/tcp_output.c:
>> >
>> > 10.0 Code:
>> >
>> > 780             if (len > tp->t_tsomax - hdrlen) {      !!
>> > 781                     len = tp->t_tsomax - hdrlen;    !!
>> > 782                     sendalot = 1;
>> > 783             }
>> >
>> >
>> > I've put debugging here, set the nic's max TSO as per Rick's patch (set to
>> > say 32k), and have seen that tp->t_tsomax == IP_MAXPACKET. It's being set
>> > someplace else, and thus our attempts to set TSO on the nic may be in vain.
>> >
>> > It may have mattered more in 9.2, as I see the code doesn't use
>> > tp->t_tsomax in some locations, and may actually default to what the nic is
>> > set to.
>> >
>> > The NIC may still win, I didn't walk through the code to confirm, it was
>> > enough to suggest to me that setting TSO wouldn't fix this issue.
>>
>>
>> I just applied Rick's ixgbe TSO patch and additionally wanted to be able
>> to easily change the value of hw_tsomax, so I made a sysctl out of it.
>>
>> While doing that, I asked myself the same question. Where and how will
>> this value actually be used and how comes that tcp_output() uses that other
>> value in struct tcpcb.
>>
>> The only place tcpcb->t_tsomax gets set, that I have found so far, is in
>> tcp_input.c's tcp_mss() function. Some subfunctions get called:
>>
>> tcp_mss() -> tcp_mss_update() -> tcp_maxmtu()
>>
>> Then tcp_maxmtu() indeed uses the interface's hw_tsomax value:
>>
>> 1746            cap->tsomax = ifp->if_hw_tsomax;
>>
>> It get's passed back to tcp_mss() where it is set on the connection
>> level which will be used in tcp_output() later on.
>>
>> tcp_mss() gets called from multiple places, I'll look into that later. I
>> will let you know if I find out more.
>>
>>
>> Markus
>>
>>
>> > However, this is still a TSO related issue, it's just not one related to
>> > the setting of TSO's max size.
>> >
>> > A 10.0-STABLE system with tso disabled on ix0 doesn't have a single packet
>> > over IP_MAXPACKET in 1 hour of runtime. I'll let it go a bit longer to
>> > increase confidence in this assertion, but I don't want to waste time on
>> > this when I could be logging problem packets on a system with TSO enabled.
>> >
>> > Comments are very welcome..
>> > _______________________________________________
>> > freebsd-net@freebsd.org mailing list
>> > http://lists.freebsd.org/mailman/listinfo/freebsd-net
>> > To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
>> >
>>
>
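
For reference, the clamp quoted from sys/netinet/tcp_output.c can be pulled out into a standalone function to see what it does and does not bound. This is only a sketch: clamp_tso_len(), struct clamp_result and max_frame_len() are hypothetical names, not kernel code, and it assumes that hdrlen covers the IP and TCP headers including options and that the frame carries a plain 14-byte Ethernet header.

```c
#include <assert.h>

#define ETHER_HDR_LEN 14L  /* ASSUMPTION: untagged Ethernet header */

/* Result of the clamp: payload length to send, and the sendalot flag. */
struct clamp_result {
	long len;
	int  sendalot;
};

/*
 * Same comparison as the quoted tcp_output.c lines, with tp->t_tsomax
 * passed in as a plain argument so it can be tried with different values.
 */
static struct clamp_result
clamp_tso_len(long len, long tsomax, long hdrlen)
{
	struct clamp_result r = { len, 0 };

	assert(tsomax > hdrlen);
	if (len > tsomax - hdrlen) {
		r.len = tsomax - hdrlen;
		r.sendalot = 1;
	}
	return (r);
}

/*
 * After the clamp, len + hdrlen <= tsomax, so the IP packet is at most
 * tsomax bytes. The Ethernet header is not part of that bound, so the
 * largest frame handed to the driver is tsomax + ETHER_HDR_LEN.
 */
static long
max_frame_len(long tsomax)
{
	return (tsomax + ETHER_HDR_LEN);
}
```

Under those assumptions, a t_tsomax of 65535 allows frames up to 65549 bytes, while 65495 allows up to 65509. The frames in the capture at the top of this message (65498 to 65506 bytes) fit under the second bound. That is only arithmetic on the quoted code; it does not confirm where the driver's actual limit sits.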