Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled
On 11/8/2011 11:00 PM, Adrian Chadd wrote:
> On 8 November 2011 09:21, Hooman Fazaeli wrote:
>> With MSIX enabled, the link task (em_handle_link) does _not_ trigger
>> _start when the link changes state from inactive to active (which it
>> should). If if_snd quickly fills up during a temporary link loss,
>> transmission is stopped forever and the driver never recovers from
>> that state.
>>
>> The last patch should have reduced the frequency of the problem, but
>> it assumes every IFQ_ENQUEUE is followed by an if_start, which is not
>> a true assumption.
>
> FWIW, I saw something very similar with the if_arge code port from
> Linux. If the TX queue filled up and wasn't serviced before it hit
> completely full, it was never drained.
>
> It may be worthwhile auditing some of the other NIC drivers to ensure
> this kind of situation isn't occurring. Especially if they came from
> Linux. :-)
>
> That's a great catch, I hope it finally fixes the if_em issues with
> MSIX. :-)
>
> Adrian

Just for the record, I should inform you that igb, ixgb and ixgbe have the same issue. I have not checked other drivers.

And there is another subtle problem with all these drivers: if transmit (xxx_xmit) fails because of a temporary memory shortage (i.e., a DMA failure with ENOMEM), the driver may enter the OACTIVE state and _never_ recover! The scenario is much the same as before:

- if_start is executed.
- xxx_xmit fails with ENOMEM.
- xxx_start_locked sets OACTIVE. Note that this is different from a low TX descriptor condition, which also sets OACTIVE.
- The stack enqueues packets in if_snd but does not call if_start, since the driver is OACTIVE.
- The stack enqueues more packets until if_snd fills up and packets start to drop.
- Since there is nowhere in the driver's code to retry transmission when memory becomes available again (xxx_local_timer is a candidate), the driver remains OACTIVE forever until it is re-initialized.

I am working on patches for em/igb/ixgb/ixgbe to fix these issues and would be happy to share them with anyone who is interested.
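The ENOMEM scenario above can be sketched as a small user-space model. This is only an illustration under stated assumptions, not driver code: every name here (toy_ifnet, toy_xmit, toy_start, toy_enqueue, toy_local_timer, QUEUE_MAX) is hypothetical, and the timer retry is one possible reading of the xxx_local_timer suggestion.

```c
#include <errno.h>
#include <stdbool.h>

#define QUEUE_MAX 4		/* hypothetical if_snd depth */

/* Hypothetical stand-in for the interface state discussed above. */
struct toy_ifnet {
	int  queued;		/* packets waiting in the send queue */
	int  sent;		/* packets handed to the hardware */
	int  dropped;		/* packets dropped because the queue was full */
	bool oactive;		/* models the OACTIVE flag */
	bool dma_ok;		/* false models a transient ENOMEM condition */
};

/* Models xxx_xmit: fails with ENOMEM while DMA resources are short. */
static int toy_xmit(struct toy_ifnet *ifp)
{
	if (!ifp->dma_ok)
		return ENOMEM;
	ifp->sent++;
	return 0;
}

/* Models xxx_start_locked: drain the queue, set OACTIVE on failure. */
static void toy_start(struct toy_ifnet *ifp)
{
	ifp->oactive = false;
	while (ifp->queued > 0) {
		if (toy_xmit(ifp) != 0) {
			ifp->oactive = true;
			return;
		}
		ifp->queued--;
	}
}

/* Models the stack: enqueue, and call start only if not OACTIVE. */
static void toy_enqueue(struct toy_ifnet *ifp)
{
	if (ifp->queued >= QUEUE_MAX) {
		ifp->dropped++;
		return;
	}
	ifp->queued++;
	if (!ifp->oactive)
		toy_start(ifp);
}

/* Models a periodic timer that retries transmission while OACTIVE. */
static void toy_local_timer(struct toy_ifnet *ifp)
{
	if (ifp->oactive && ifp->queued > 0)
		toy_start(ifp);
}
```

In this model, one failed toy_xmit leaves the flag set, so later toy_enqueue calls only fill the queue and then drop, even after dma_ok becomes true again. A single toy_local_timer call is enough to drain the queue and clear the flag.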
Since these are really severe problems, I hope the gurus apply official fixes ASAP.

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled
On 11/8/2011 10:23 PM, Jason Wolfe wrote:
> On Tue, Nov 8, 2011 at 10:21 AM, Hooman Fazaeli <hoomanfaza...@gmail.com> wrote:
>> I have allocated more time to the problem and I guess I can explain
>> what your problem is.
>>
>> With MSIX disabled, the driver uses the fast interrupt handler
>> (em_irq_fast), which calls the rx/tx task and then checks for a link
>> status change. This implies that the rx/tx task is executed with
>> every link state change. This is not efficient, as it is a waste of
>> time to start transmission when the link is down. However, it has
>> the effect that after a temporary link loss
>> (active->inactive->active), _start is executed and transmission
>> continues normally. The value of link_toggles (3) clearly indicates
>> that you had such a transition when the problem occurred.
>>
>> With MSIX enabled, the link task (em_handle_link) does _not_ trigger
>> _start when the link changes state from inactive to active (which it
>> should). If if_snd quickly fills up during a temporary link loss,
>> transmission is stopped forever and the driver never recovers from
>> that state.
>>
>> The last patch should have reduced the frequency of the problem, but
>> it assumes every IFQ_ENQUEUE is followed by an if_start, which is not
>> a true assumption.
>>
>> If you are willing to test, I can prepare another patch for you to
>> fix the issue in a different and more reliable way.
>
> Hooman,
>
> Thanks again for the assist, it sounds like this may also be why we
> see a bit higher latency with MSI-X disabled on this chipset. I'm
> happy to test any patches as I have a handful of boxes set aside to
> 'research' this issue. Hopefully the testing here helps along any
> patches to the tree for others' benefit also.
>
> Jason

Latency may or may not be related. I am doing more tests and will post my findings soon.
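The difference between the two link-handling paths described in this message can be modelled in a few lines. This is a hedged sketch, not the em(4) code: link_softc, link_start, link_handle and the kick_on_up switch are hypothetical names chosen for the illustration, and the real handlers do considerably more.

```c
#include <stdbool.h>

/* Hypothetical per-device state for the illustration. */
struct link_softc {
	bool link_active;	/* current link state */
	int  queued;		/* packets waiting in the send queue */
	int  sent;		/* packets handed to the hardware */
};

/* Models _start: only transmits while the link is up. */
static void link_start(struct link_softc *sc)
{
	if (!sc->link_active)
		return;
	sc->sent += sc->queued;
	sc->queued = 0;
}

/*
 * Models the link task. With kick_on_up false it only records the new
 * state, which is the behaviour described for the MSIX path. With
 * kick_on_up true it also restarts transmission on an
 * inactive->active transition.
 */
static void link_handle(struct link_softc *sc, bool now_active,
    bool kick_on_up)
{
	bool was_active = sc->link_active;

	sc->link_active = now_active;
	if (kick_on_up && !was_active && now_active)
		link_start(sc);
}
```

With kick_on_up false, packets queued during the outage stay queued after the link returns unless something else calls link_start. With kick_on_up true, the same transition drains them.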
Re: Possible MROUTING regression in 9.0 RC1
Crashes happen often, even on different hardware (another server).

[root@timp ~]# /usr/local/sbin/igmpproxy -dvv /usr/local/etc/igmpproxy.conf
Searching for config file at '/usr/local/etc/igmpproxy.conf'
Config: Quick leave mode enabled.
Config: Got a phyint token.
Config: IF: Config for interface re0.
Config: IF: Got upstream token.
Config: IF: Got ratelimit token '0'.
Config: IF: Got threshold token '1'.
Config: IF: Got altnet token 172.31.242.0/24.
Config: IF: Altnet: Parsed altnet to 172.31.242/24.
IF name : re0
Next ptr : 0
Ratelimit : 0
Threshold : 1
State : 1
Allowednet ptr : c71040
Config: Got a phyint token.
Config: IF: Config for interface bridge0.
Config: IF: Got downstream token.
Config: IF: Got ratelimit token '0'.
Config: IF: Got threshold token '1'.
IF name : bridge0
Next ptr : 0
Ratelimit : 0
Threshold : 1
State : 2
Allowednet ptr : 0
Config: Got a phyint token.
Config: IF: Config for interface lo0.
Config: IF: Got disabled token.
IF name : lo0
Next ptr : 0
Ratelimit : 0
Threshold : 1
State : 0
Allowednet ptr : 0
Config: Got a phyint token.
Config: IF: Config for interface plip0.
Config: IF: Got disabled token.
IF name : plip0
Next ptr : 0
Ratelimit : 0
Threshold : 1
State : 0
Allowednet ptr : 0
buildIfVc: Interface re0 Addr: 10.85.13.39, Flags: 0x8843, Network: 10.85.13/24
buildIfVc: Interface lo0 Addr: 127.0.0.1, Flags: 0x8049, Network: 127/8
buildIfVc: Interface bridge0 Addr: 172.16.254.1, Flags: 0x8843, Network: 172.16.254/24
Found config for re0
Found config for bridge0
adding VIF, Ix 0 Fl 0x0 IP 0x270d550a re0, Threshold: 1, Ratelimit: 0
Network for [re0] : 10.85.13/24
Network for [re0] : 172.31.242/24
adding VIF, Ix 1 Fl 0x0 IP 0x01fe10ac bridge0, Threshold: 1, Ratelimit: 0
Network for [bridge0] : 172.16.254/24
Got 262144 byte buffer size in 0 iterations
Joining all-routers group 224.0.0.2 on vif 172.16.254.1
joinMcGroup: 224.0.0.2 on bridge0
SENT Membership query from 172.16.254.1to 224.0.0.1
Sent membership query from 172.16.254.1 to 224.0.0.1. Delay: 10
Created timeout 1 (#0) - delay 10 secs
(Id:1, Time:10)
Created timeout 2 (#1) - delay 21 secs
(Id:1, Time:10)
(Id:2, Time:21)
received packet from 172.16.254.1 shorter (28 bytes) than hdr+data length (20+28)
received packet from 172.16.254.1 shorter (32 bytes) than hdr+data length (24+32)
About to call timeout 1 (#0)
Aging routes in table.
Current routing table (Age active routes):
-
No routes in table...
-
received packet from 10.85.13.5 shorter (28 bytes) than hdr+data length (20+28)
^Cselect() failure; Errno(4): Interrupted system call
Got a interupt signal. Exiting.
clean handler called
All routes removed. Routing table is empty.
Shutdown complete

2011/11/8 Pavel Timofeev:
> And sometimes igmpproxy's shutdown leads to a crash of my system.
> Without any panics, it just reboots. oO
>
> 2011/11/7 Pavel Timofeev:
>> Hello! I have problems with ip_mroute (loaded as a module), the kernel
>> multicast packet forwarder.
>> I have 2 disks: FreeBSD 8.2-RELEASE amd64 on the first and FreeBSD 9.0-RC1 on
>> the second.
>> I use net/igmpproxy to watch IPTV on my home Atom-based router.
>>
>> On FreeBSD 8.2 it works well.
>>
>> But when I try to use FreeBSD 9.0-RC1 in the same role (with the same
>> configs, of course) I get messages like:
>> Nov 7 16:16:46 timp igmpproxy[35495]: received packet from
>> 172.16.254.1 shorter (28 bytes) than hdr+data length (20+28)
>> Nov 7 16:16:47 timp igmpproxy[35495]: received packet from
>> 172.16.254.1 shorter (32 bytes) than hdr+data length (24+32)
>> Nov 7 16:17:28 timp igmpproxy[35495]: received packet from 10.85.13.5
>> shorter (28 bytes) than hdr+data length (20+28)
>> And IPTV doesn't work =(
>>
>> Any ideas?
>> Do you need configs?
Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled
There's no locking around the OACTIVE flag set/clear, right?
Is it possible that multiple TX threads are fiddling with OACTIVE and then it's not being properly cleared and tx kicked?

Adrian
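To make the concern concrete, here is a minimal user-space sketch of the pattern the question is about: the "ring full, set OACTIVE" decision and the "descriptors reclaimed, clear OACTIVE" decision are each taken under the same lock, so one cannot slip between the other's check and update. Whether the driver actually has such a race is the open question in this message; tx_ring, ring_try_xmit and ring_clean are hypothetical names, and a pthread mutex stands in for whatever lock a driver would use.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical transmit ring for the illustration. */
struct tx_ring {
	pthread_mutex_t lock;	/* protects free_desc and oactive together */
	int  free_desc;		/* free transmit descriptors */
	bool oactive;		/* models the OACTIVE flag */
};

/*
 * Transmit path: the space check and the flag update happen under one
 * lock, so a concurrent clean cannot run between them.
 * Returns true if a descriptor was consumed.
 */
static bool ring_try_xmit(struct tx_ring *r)
{
	bool ok;

	pthread_mutex_lock(&r->lock);
	if (r->free_desc == 0) {
		r->oactive = true;
		ok = false;
	} else {
		r->free_desc--;
		ok = true;
	}
	pthread_mutex_unlock(&r->lock);
	return ok;
}

/*
 * Completion path: reclaim descriptors and clear the flag under the
 * same lock. Returns true when the caller should kick transmission.
 */
static bool ring_clean(struct tx_ring *r, int reclaimed)
{
	bool kick;

	pthread_mutex_lock(&r->lock);
	r->free_desc += reclaimed;
	kick = r->oactive && r->free_desc > 0;
	if (kick)
		r->oactive = false;
	pthread_mutex_unlock(&r->lock);
	return kick;
}
```

Without the shared lock, a transmit thread could observe a full ring, lose the CPU while the clean path frees descriptors and clears the flag, and then set the flag again with nothing left to clear it.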
ipf(8) for TCP rate limiting
Hi.

My machine has some ipf(8) rules, and I see that when there is a TCP connection storm to the http port, the filter sends out TCP resets. I wanted to know if it's possible to configure, using ipf, the pps limit for TCP connections before the RSTs kick in.

regards,
vijay
Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled
Hmmm, that's an interesting point Adrian, I'll look at that more closely.

Jack

On Wed, Nov 9, 2011 at 4:09 PM, Adrian Chadd wrote:
> There's no locking around the OACTIVE flag set/clear, right?
> Is it possible that multiple TX threads are fiddling with OACTIVE and
> then it's not being properly cleared and tx kicked?
>
> Adrian
Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled
BTW, the new delta on the driver is coming. I just ran into some issues with the validation testing done in house and I've had to iron a few things out.

I am going to implement Hooman's idea of a TX clean from local_timer; that seems like a good idea.

The other thing I'm doing right now is reenabling the MULTIQUEUE define and looking at 82574 performance. Once I did that, I found certain pieces that needed tweaking. The jury is still out on whether or not this is worth doing, but I'm making it possible for people to try for themselves.

Anyone that really wants to try this driver early might want to send me some directed email.

Jack

On Wed, Nov 9, 2011 at 9:00 PM, Jack Vogel wrote:
> Hmmm, that's an interesting point Adrian, I'll look at that more closely.
>
> Jack
>
> On Wed, Nov 9, 2011 at 4:09 PM, Adrian Chadd wrote:
>> There's no locking around the OACTIVE flag set/clear, right?
>> Is it possible that multiple TX threads are fiddling with OACTIVE and
>> then it's not being properly cleared and tx kicked?
>>
>> Adrian
Re: FreeBSD 9 and ARP multicast source address error messages
Alexander,

On Tue, Nov 08, 2011 at 05:14:45PM -0500, Alexander Wittig wrote:
A> I upgraded one of my machines from FreeBSD 8 to 9.0-RC1 (FreeBSD bt.pa.msu.edu 9.0-RC1 FreeBSD 9.0-RC1 #3: Fri Oct 28 16:45:28 EDT 2011 r...@bt.pa.msu.edu:/usr/obj/usr/src/sys/ALEX i386), and ever since that upgrade the kernel keeps flooding my log files with messages like these:
A> Nov 7 16:40:01 bt kernel: in_arp: source hardware address is multicast.in_arp: source hardware address is multicast.
A> Nov 7 16:42:02 bt kernel: in_arp: source hardware address is multicast.in_arp: source hardware address is multicast.
A>
A> A Google search for these didn't reveal any useful results as to why this happens or how to fix it. So I did a tcpdump and matched the time stamps with packets, and I found the ones causing problems (the only ones with a multicast bit set) to be like this:
A> 16:40:01.099823 02:02:23:09:44:3c > 03:bf:23:09:44:87, ethertype ARP (0x0806), length 60: Ethernet (len 6), IPv4 (len 4), Reply 35.9.68.228 is-at 03:bf:23:09:44:e4, length 46
A>         0x0000: 03bf 2309 4487 0202 2309 443c 0806 0001
A>         0x0010: 0800 0604 0002 03bf 2309 44e4 2309 44e4
A>         0x0020: 02bf 2309 443c 2309 4487
A>         0x0030:
A>
A> It appears the broadcast MAC 03:bf:23:09:44:87 is part of Microsoft's network load balancing mechanism, with the 03:bf indicating that much and the 23:09:44:87 containing the IP address of the load balance cluster (35.9.68.228). These types of MACs seem to be commonly used in their load balancing implementation.
A>
A> To prevent these messages from producing thousands of lines of logs each day, I added the following two IPFW rules and enabled Ethernet packet filtering (sysctl net.link.ether.ipfw=1):
A> deny ip from any to any MAC 03:bf:00:00:00:00/16 any layer2
A> allow ip from any to any layer2
A>
A> This effectively blocks those packets and the resulting error messages. But I'm wondering if the newly added(?) ARP code in FBSD 9 is a bit too fussy about these, or if MS is abusing the ARP protocol here. Either way, this was never a problem with FBSD 7 or 8.

Can you try the attached patch? It reduces the severity level of all ARP messages that can be triggered by a packet on the network, with the exception of "using my IP address". With the default syslog.conf, ARP errors will no longer reach the console.

Also, the multicast message lacked the "\n" newline character; that is why, I suppose, syslogd failed to coalesce a number of messages into a single "last message repeated X times" entry.

--
Totus tuus, Glebius.

Index: if_ether.c
===
--- if_ether.c	(revision 227416)
+++ if_ether.c	(working copy)
@@ -433,7 +433,7 @@
 	if (m->m_len < sizeof(struct arphdr) &&
 	    ((m = m_pullup(m, sizeof(struct arphdr))) == NULL)) {
-		log(LOG_ERR, "arp: runt packet -- m_pullup failed\n");
+		log(LOG_NOTICE, "arp: runt packet -- m_pullup failed\n");
 		return;
 	}
 	ar = mtod(m, struct arphdr *);
@@ -443,7 +443,7 @@
 	    ntohs(ar->ar_hrd) != ARPHRD_ARCNET &&
 	    ntohs(ar->ar_hrd) != ARPHRD_IEEE1394 &&
 	    ntohs(ar->ar_hrd) != ARPHRD_INFINIBAND) {
-		log(LOG_ERR, "arp: unknown hardware address format (0x%2D)\n",
+		log(LOG_NOTICE, "arp: unknown hardware address format (0x%2D)\n",
 		    (unsigned char *)&ar->ar_hrd, "");
 		m_freem(m);
 		return;
@@ -451,7 +451,7 @@
 	if (m->m_len < arphdr_len(ar)) {
 		if ((m = m_pullup(m, arphdr_len(ar))) == NULL) {
-			log(LOG_ERR, "arp: runt packet\n");
+			log(LOG_NOTICE, "arp: runt packet\n");
 			m_freem(m);
 			return;
 		}
@@ -527,7 +527,7 @@
 	req_len = arphdr_len2(ifp->if_addrlen, sizeof(struct in_addr));
 	if (m->m_len < req_len && (m = m_pullup(m, req_len)) == NULL) {
-		log(LOG_ERR, "in_arp: runt packet -- m_pullup failed\n");
+		log(LOG_NOTICE, "in_arp: runt packet -- m_pullup failed\n");
 		return;
 	}
@@ -537,13 +537,14 @@
 	 * a protocol length not equal to an IPv4 address.
 	 */
 	if (ah->ar_pln != sizeof(struct in_addr)) {
-		log(LOG_ERR, "in_arp: requested protocol length != %zu\n",
+		log(LOG_NOTICE, "in_arp: requested protocol length != %zu\n",
 		    sizeof(struct in_addr));
 		return;
 	}
 	if (ETHER_IS_MULTICAST(ar_sha(ah))) {
-		log(LOG_ERR, "in_arp: source hardware address is multicast.");
+		log(LOG_NOTICE, "in_arp: %*D is multicast\n",
+		    ifp->if_addrlen, (u_char *)ar_sha(ah), ":");
 		return;
 	}
@@ -645,7 +646,7 @@
 	if (!bcmp(ar_sha(ah), enaddr, ifp->if_addrlen))
 		goto drop;	/* it's from me, ignore it. */
 	if (!bcmp(ar_sha(ah), ifp->if_broadcastaddr, ifp->if_addrlen)) {
-		log(LOG_ERR,
+		log(LOG_NOTICE,
 		    "arp: link address is broadcast for IP address %s!\n",
 		    inet_ntoa(isaddr));
 		goto drop;
@@ -681,7 +682,7 @@
 	/* the following is not an error when doing bridging */
 	if (!bridged && la->lle_tbl->llt_ifp != ifp && !carp_match) {
 		if (log_arp_wrong_iface)
-			log(LOG_ERR, "arp: %s is on %s "
+