Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled
> > Hi All,
> >
> > I've got almost the same problem with an Intel 82574L-based NIC. My platform is an NVIDIA ION running an Atom 1.6, and the NIC is an external PCI Express adapter. Unlike Jason's case, mine always gets stuck receiving traffic: its Ierrs counter keeps increasing while Ipkts does not. Thanks to Jason's script I can see those lockups and the interface flapping every several hours. My system is not a heavily loaded server, just a home NAS/router, usually routing at 100 Mbps or less. Neither disabling MSIX nor tuning txd/rxd helps.
>
> Hi,
>
> When this occurs, does RX lock up completely? I.e., does _no_ valid RX traffic get through? If so, does the error count increase 1:1 with the traffic you're trying to send to it?

Hi Adrian,

Yes, it's completely locked up: ping packet loss to the interface is 100%. I also managed to run tcpdump -ni em0, and it showed only outgoing packets from the interface; no packets arrived at the interface. As for the error counter ratio, I'm not sure. I don't know how to check it other than by comparing the counters on the switch with those on the interface, and that is a bit complicated to catch in time.

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
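One way to answer the "1:1" question is to take two snapshots of the em0 counters from `netstat -ind` and compare the deltas with what the switch port says it transmitted. The following is only a minimal sketch of that arithmetic: the struct, the function names, and the idea of feeding in a frame count read from the switch are illustrative assumptions, not part of any existing tool.

```c
#include <stdbool.h>
#include <stdint.h>

/* One snapshot of the counters from the link-level row of `netstat -ind`.
 * (Hypothetical helper type; nothing in the thread defines it.) */
struct if_sample {
    uint64_t ipkts;  /* Ipkts column */
    uint64_t ierrs;  /* Ierrs column */
};

/* True when input errors grew but not a single packet was received
 * between the two snapshots, which is the symptom described above. */
bool rx_looks_wedged(struct if_sample prev, struct if_sample cur)
{
    return cur.ierrs > prev.ierrs && cur.ipkts == prev.ipkts;
}

/* Input errors per frame the peer (e.g. the switch port) reports having
 * sent during the same interval. A result near 1.0 would suggest every
 * incoming frame is being counted as an error. Returns -1.0 when the
 * peer sent nothing, since the ratio is undefined then. */
double ierrs_per_sent_frame(struct if_sample prev, struct if_sample cur,
                            uint64_t peer_frames_sent)
{
    if (peer_frames_sent == 0)
        return -1.0;
    return (double)(cur.ierrs - prev.ierrs) / (double)peer_frames_sent;
}
```

The two snapshots and the switch-side count have to cover the same interval, which is the hard part Emil mentions; the arithmetic itself is trivial.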
Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled
Hello, Mike.

You wrote on October 7, 2011, 19:06:34:

> This sure sounds like the issue I was seeing with the 7.1.9 driver...
> However, it has been fixed for me by going to 7.2.3, which is in
> RELENG_8. Is it possible you have a couple of issues going on since you
> are using lagg as well? Another problem some folks have reported is
> that in the BIOS, if you have an option for ASPM, make sure it's disabled.

I had a lot of such problems with 7.1.9 on my 82566DM, and I thought the new driver was OK, but yesterday it happened again with 7.2.3. No packets could be sent, the buffers were overfilled, and only a full reset helps (after "ifconfig wm0 down && ifconfig em0 up", ping starts to report "Host is down" for any remote host instead of "No buffer space available")...

8-STABLE, 7.2.3 driver, amd64, 82566DM LOM chip.

-- 
// Black Lion AKA Lev Serebryakov
Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled
What did netstat -m show?

Regards,
Steve

----- Original Message -----
From: "Lev Serebryakov"
To: "Mike Tancsa"
Sent: Thursday, October 27, 2011 10:26 AM
Subject: Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled

> Hello, Mike.
>
> You wrote on October 7, 2011, 19:06:34:
>
> > This sure sounds like the issue I was seeing with the 7.1.9 driver...
> > However, it has been fixed for me by going to 7.2.3, which is in
> > RELENG_8. Is it possible you have a couple of issues going on since you
> > are using lagg as well? Another problem some folks have reported is
> > that in the BIOS, if you have an option for ASPM, make sure it's disabled.
>
> I had a lot of such problems with 7.1.9 on my 82566DM, and I thought the new driver was OK, but yesterday it happened again with 7.2.3. No packets could be sent, the buffers were overfilled, and only a full reset helps (after "ifconfig wm0 down && ifconfig em0 up", ping starts to report "Host is down" for any remote host instead of "No buffer space available")...
>
> 8-STABLE, 7.2.3 driver, amd64, 82566DM LOM chip.
>
> -- 
> // Black Lion AKA Lev Serebryakov
Re[2]: PCI-E VT6130 NIC (if_vge) hang system with gigabit link
October 26, 2011, 20:49, from YongHyeon PYUN:
> On Wed, Oct 26, 2011 at 01:49:14PM +0400, Andrey Smagin wrote:
> > Hi!
> > vge0@pci0:2:0:0: class=0x02 card=0x01101106 chip=0x31191106 rev=0x82 hdr=0x00
> >     vendor     = 'VIA Technologies, Inc.'
> >     device     = 'VT6120/VT6121/VT6122 Gigabit Ethernet Adapter'
> >     class      = network
> >     subclass   = ethernet
> >     bar   [10] = type I/O Port, range 32, base 0x8000, size 256, enabled
> >     bar   [14] = type Memory, range 64, base 0xd410, size 256, enabled
> >     cap 01[50] = powerspec 3 supports D0 D1 D2 D3 current D0
> >     cap 10[90] = PCI-Express 1 endpoint max data 128(128) link x1(x1)
> >     cap 05[c0] = MSI supports 1 message, 64 bit, vector masks enabled with 1 message
> >     ecap 0001[100] = AER 1 0 fatal 1 non-fatal 6 corrected
> >     ecap 0003[130] = Serial 1 1106
> > vge1@pci0:1:0:0: class=0x02 card=0x01101106 chip=0x31191106 rev=0x82 hdr=0x00
> >     vendor     = 'VIA Technologies, Inc.'
> >     device     = 'VT6120/VT6121/VT6122 Gigabit Ethernet Adapter'
> >     class      = network
> >     subclass   = ethernet
> >     bar   [10] = type I/O Port, range 32, base 0x7000, size 256, enabled
> >     bar   [14] = type Memory, range 64, base 0xd400, size 256, enabled
> >     cap 01[50] = powerspec 3 supports D0 D1 D2 D3 current D0
> >     cap 10[90] = PCI-Express 1 endpoint max data 128(128) link x1(x1)
> >     cap 05[c0] = MSI supports 1 message, 64 bit, vector masks enabled with 1 message
> >     ecap 0001[100] = AER 1 1 fatal 1 non-fatal 6 corrected
> >     ecap 0003[130] = Serial 1 1106
> >
> > dmesg is empty
> Hmm, check whether you have an old dmesg file in the /var/log directory.
> At least, I need the output of 'devinfo -rv' to know which PHY you have.

pcib4 pnpinfo vendor=0x10de device=0x005d subvendor=0x subdevice=0x class=0x060400 at slot=13 function=0 handle=\_SB_.PCI0.XVR1
    I/O ports:
        0x8000-0x8fff
    I/O memory addresses:
        0xd410-0xd41f
  pci2
    vge0 pnpinfo vendor=0x1106 device=0x3119 subvendor=0x1106 subdevice=0x0110 class=0x02 at slot=0 function=0
        Interrupt request lines:
            256
        pcib4 I/O port window:
            0x8000-0x80ff
        pcib4 memory window:
            0xd410-0xd41000ff
      miibus1
        ip1000phy0 pnpinfo oui=0x9c3 model=0x19 rev=0x0 at phyno=1
pcib5 pnpinfo vendor=0x10de device=0x005d subvendor=0x subdevice=0x class=0x060400 at slot=14 function=0 handle=\_SB_.PCI0.XVR0
    I/O ports:
        0x7000-0x7fff
    I/O memory addresses:
        0xd400-0xd40f
  pci1
    vge1 pnpinfo vendor=0x1106 device=0x3119 subvendor=0x1106 subdevice=0x0110 class=0x02 at slot=0 function=0
        Interrupt request lines:
            257
        pcib5 I/O port window:
            0x7000-0x70ff
        pcib5 memory window:
            0xd400-0xd4ff
      miibus2
        ip1000phy1 pnpinfo oui=0x9c3 model=0x19 rev=0x0 at phyno=1

> By chance, are you using manual link configuration instead of
> relying on auto-negotiation?

I tried both manual link configuration and auto-negotiation. I don't remember now which mode hung the system. I will try again and report the result.

> > October 26, 2011, 00:11, from YongHyeon PYUN:
> > > On Tue, Oct 25, 2011 at 10:44:48AM +0400, Andrey Smagin wrote:
> > > > Hi ALL! If I connect a gigabit switch to my card, the system hangs until I
> > > > unplug the patch cord from the device.
> > > > With a 100 Mbit switch the card works fine.
> > >
> > > Show me the output of dmesg and 'pciconf -lcbv'.
> > >
> > > > System: Current - r226163
Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled
Hello, Steven.

You wrote on October 27, 2011, 13:49:29:

> What did netstat -m show?

Nothing suspicious :(

13414/2921/16335 mbufs in use (current/cache/total)
4997/533/5530/204800 mbuf clusters in use (current/cache/total/max)
4626/329 mbuf+clusters out of packet secondary zone in use (current/cache)
78/2976/3054/192000 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
13659K/13700K/27359K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

-- 
// Black Lion AKA Lev Serebryakov
Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled
Hi Hooman,

Here is what I've got when the script triggered just in time when the interface was locked:

11.10.26-23:39:10 ... interface em0 is down...

FreeBSD ion.hotplug.ru 8.2-STABLE FreeBSD 8.2-STABLE #0: Thu Oct 20 20:20:25 MSD 2011 r...@epia.home.lan:/usr/obj/usr/src/sys/ION6debug amd64
11:39PM up 1:12, 2 users, load averages: 0.26, 0.48, 0.58

== vmstat -i ==
interrupt                 total       rate
irq22: nfe0            16644480       3865
cpu0: timer             8610122       1999
irq256: ahci0            606705        140
irq257: em0:rx 0        3896622        904
irq258: em0:tx 0        2762957        641
irq259: em0:link            620          0
cpu3: timer             8609499       1999
cpu1: timer             8609499       1999
cpu2: timer             8609499       1999
Total                  58350003      13550

== netstat -ind ==
Name    Mtu Network       Address              Ipkts Ierrs Idrop    Opkts Oerrs  Coll  Drop
usbus     0                                        0     0     0        0     0     0     0
usbus     0                                        0     0     0        0     0     0     0
nfe0   1500               00:25:22:21:86:89  7157140     0     0 12266747     0     0     0
nfe0   1500 fe80::225:22f fe80::225:22ff:fe        0     -     -       85     -     -     -
nfe0   1500 10.16.128.0/1 10.16.189.71             0     -     -    48135     -     -     -
em0    9000               00:1b:21:ab:bf:4a  5465087   623     0  2862028     0     0   113
em0    9000 192.168.168.0 192.168.168.1       764085     -     -  1005078     -     -     -
em0    9000 fe80::21b:21f fe80::21b:21ff:fe       45     -     -      252     -     -     -
em0    9000 2002:d58d:871 2002:d58d:8715:1:       73     -     -       38     -     -     -
wifi   1500               00:1b:21:ab:bf:4a      347     0     0      350     0     0     0
wifi   1500 192.168.168.6 192.168.168.65           0     -     -        0     -     -     -
wifi   1500 fe80::225:x   fe80::225:x:x            0     -     -      349     -     -     -
wifi   1500 2002:x:x      2002:x:x:2:              0     -     -        0     -     -     -
wifio  1500               00:1b:21:ab:bf:4a    59559     0     0   114639     0     0     0
wifio  1500 192.168.168.8 192.168.168.81           0     -     -      160     -     -     -
wifio  1500 fe80::225:x   fe80::225:x:x            0     -     -        0     -     -     -
stf0   1280                                     5725     0     0     6125   420     0     0
stf0   1280 2002:x:x      2002:x:x::1           1878     -     -     1121     -     -     -
ng0*   1500                                        0     0     0        0     0     0     0
ng1*   1500                                        0     0     0        0     0     0     0
ng2    1492                                  7143733     0     0 12234436     0     0     0
ng2    1492 213.141.x.x   213.141.x.x        4735932     -     -  8480089     -     -     -
ng2    1492 fe80::x:x     fe80::x:x:x              0     -     -        1     -     -     -
tun0   1455                                      350     0     0      172     0     0     0
tun0   1455 fe80::225:x   fe80::225:x:x            0     -     -        2     -     -     -
tun0   1455 192.168.169.1 192.168.169.1          117     -     -      167     -     -     -

Oct 26 23:39:11 ion kernel: em0: hw tdh = 975, hw tdt = 944
Oct 26 23:39:11 ion kernel: em0: hw rdh = 960, hw rdt = 959
Oct 26 23:39:11 ion kernel: em0: Tx Queue Status = 1
Oct 26 23:39:11 ion kernel: em0: TX descriptors avail = 31
Oct 26 23:39:11 ion kernel: em0: Tx Descriptors avail failure = 0
Oct 26 23:39:11 ion kernel: em0: RX discarded packets = 0
Oct 26 23:39:11 ion kernel: em0: RX Next to Check = 960
Oct 26 23:39:11 ion kernel: em0: RX Next to Refresh = 959

net.inet.ip.intr_queue_maxlen: 4096
net.inet.ip.intr_queue_drops: 0
dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.2.3
dev.em.0.%driver: em
dev.em.0.%location: slot=0 function=0
dev.em.0.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x8086 subdevice=0xa01f class=0x02
dev.em.0.%parent: pci2
dev.em.0.nvm: -1
dev.em.0.debug: -1
dev.em.0.rx_int_delay: 200
dev.em.0.tx_int_delay: 200
dev.em.0.rx_abs_int_delay: 4096
dev.em.0.tx_abs_int_delay: 4096
dev.em.0.rx_processing_limit: 100
dev.em.0.flow_control: 3
dev.em.0.eee_control: 0
dev.em.0.link_irq: 648
dev.em.0.mbuf_alloc_fail: 0
dev.em.0.cluster_alloc_fail: 0
dev.em.0.dropped: 0
dev.em.0.tx_dma_fail: 0
dev.em.0.rx_overruns: 0
dev.em.0.watchdog_timeouts: 0
dev.em.0.device_control: 1477444168
dev.em.0.rx_control: 100827170
dev.em.0.fc_high_water: 11264
dev.em.0.fc_low_water: 9764
dev.em.0.queue0.txd_head: 975
dev.em.0.queue0.txd_tail: 944
dev.em.0.queue0.tx_irq: 2762762
dev.em.0.queue0.no_desc_avail: 0
dev.em.0.queue0.rxd_head: 960
dev.em.0.queue0.rxd_tail: 959
dev.em.0.queue0.rx_irq: 3895860
dev.em.0.mac_stats.excess_coll: 0
dev.em.0.mac_stats.single_coll: 0
dev.em.0.mac_stats.multiple_coll: 0
dev.em.0.mac_stats.late_coll: 0
dev.em.
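The head/tail pairs in the debug output above (tdh/tdt and rdh/rdt) can be turned into ring occupancy with modular arithmetic. The sketch below assumes the usual Intel descriptor-ring convention, where the hardware owns the descriptors from head up to (but not including) tail, and assumes a ring of 1024 entries. The ring size is not stated anywhere in the thread, and the function names are made up for illustration.

```c
#include <stdint.h>

/* Number of slots from `from` up to (not including) `to`, walking
 * forward around a ring of `size` entries. Both indices must be
 * smaller than `size`. */
uint32_t ring_distance(uint32_t from, uint32_t to, uint32_t size)
{
    return (to + size - from) % size;
}

/* Descriptors currently handed to the hardware: those in [head, tail). */
uint32_t ring_owned_by_hw(uint32_t head, uint32_t tail, uint32_t size)
{
    return ring_distance(head, tail, size);
}

/* Descriptors the driver has not handed over (size minus hardware-owned).
 * A real driver usually keeps one slot unused; that detail is ignored here. */
uint32_t ring_free_for_sw(uint32_t head, uint32_t tail, uint32_t size)
{
    return size - ring_owned_by_hw(head, tail, size);
}
```

Under the 1024-entry assumption, `ring_free_for_sw(975, 944, 1024)` gives 31, which agrees with the logged "TX descriptors avail = 31", and `ring_owned_by_hw(960, 959, 1024)` gives 1023, which would mean the RX ring still has nearly all of its buffers posted. If that reading is right, the receive side is not starved of descriptors, but this is an interpretation of the numbers, not something the thread confirms.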
Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled
Hi,

On Thu, Oct 27, 2011 at 2:29 AM, Emil Muratov wrote:
> > Hi,
> >
> > Can you please post the output of these commands _when_ the problem
> > happens?
> >
> > uname -a
> > sysctl dev.em
> > netstat -ind
> > ifconfig
>
> Hi Hooman
>
> Here is what I've got when the script triggered just in time when the
> interface was locked
>
> 11.10.26-23:39:10 ... interface em0 is down...
>
> FreeBSD ion.hotplug.ru 8.2-STABLE FreeBSD 8.2-STABLE #0: Thu Oct 20 20:20:25
> MSD 2011 r...@epia.home.lan:/usr/obj/usr/src/sys/ION6debug amd64

Please upgrade to 8-STABLE; similar issues have been fixed there.

Thanks,
 - Arnaud

> [...]
Re: SCTP : problems in sending ASCONF chunks
OK, please try two fixes:

http://svn.freebsd.org/changeset/base/226868
This fixes a problem which resulted in the ASCONF chunks not being sent.

http://svn.freebsd.org/changeset/base/226869
This fixes a problem which resulted in the path confirmation chunks being sent to the wrong destination and therefore not confirming the path.

Please let me know if you still have any issues.

Best regards
Michael

On Oct 24, 2011, at 4:20 AM, jyl_2006 wrote:

> Hi, Tüxen
>
> I will provide more detail.
> The topology is:
>
>               1 (192.168.1.20)
> computer A ---                  1 computer B (192.168.1.80)
>               2 (192.168.1.50)
>
> That is, computer A has two wireless cards, which I call A_1 and A_2;
> computer B has one wireless card, which I call B_1.
>
> Now A_1 initiates an association with B_1. Here are the messages captured
> with Wireshark:
>
> INIT:
> Internet Protocol, Src: 192.168.1.20 (192.168.1.20), Dst: 192.168.1.80 (192.168.1.80)
> Supported Extensions parameter (Supported types: ASCONF, ASCONF_ACK, FORWARD_TSN, PKTDROP, STREAM_RESET, AUTH)
>
> INIT ACK:
> Internet Protocol, Src: 192.168.1.80 (192.168.1.80), Dst: 192.168.1.20 (192.168.1.20)
> Supported Extensions parameter (Supported types: ASCONF, ASCONF_ACK, FORWARD_TSN, PKTDROP, STREAM_RESET, AUTH)
>
> Debug message:
> SCTP_SACK process cum_ack:45452434 num_seg:0 a_rwnd:1864135
> Check for chunk output prw:1864135 tqe:1 tf=0
> Send called addr:0xc591c980 send length 2
> Calling ipv4 output routine from low level src addr:c0a80114
> Destination is c0a80150
> RTP route is 0xc5caaaf8 through
> IP output returns 0
> m-c-o put out 1
> Ok, we have put out 1 chunks
> USR Send complete qo:0 prw:1863859 unsent:18 tf:18 cooq:1 toqs:18 err:0
> Ok laddr->ifa:0xc5baab00 is possible, asconf_queue_mgmt: inserted asconf
> ADD_IP_ADDRESS: IPv4 address: 192.168.1.50:0
> m-c-o put out 0
> Ok, we have put out 0 chunks
> sctp_input() length:28 iphlen:20
> sctp_input(): Packet of length 48 received on wlan0 with csum_flags 0x0.
> Ok, Common input processing called, m:0xc72dab00 iphlen:20 offset:32 length:48 stcb:0xcc2335dc
> stcb:0xcc2335dc state:8
> sctp_process_control: iphlen=20, offset=32, length=48 stcb:0xcc2335dc
> sctp_process_control: processing a chunk type=3, len=16
>
> I also tested the following topology:
>
>               1 (192.168.1.20)            --1 (192.168.1.80)
> computer A ---                  computer B
>               2 (192.168.2.20)            --2 (192.168.2.80)
>
> That is, computer A has two wireless cards, which I call A_1 and A_2;
> computer B has two wireless cards, which I call B_1 and B_2.
>
> The Wireshark capture and the debug messages show the same results.
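The debug trace above prints the IPv4 source and destination as bare hex words (c0a80114, c0a80150). A small decoder makes them easier to compare with the addresses in the topology. This is a minimal sketch that assumes the word is printed most significant octet first, which is consistent with the source address matching 192.168.1.20 in the capture; `hex_to_dotted` is a made-up helper name.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Convert an IPv4 address written as hex digits (most significant octet
 * first, e.g. "c0a80114") to dotted-quad text in `out`.
 * Returns 0 on success, -1 if the input is not a valid 32-bit hex word. */
int hex_to_dotted(const char *hex, char *out, size_t outlen)
{
    char *end;
    unsigned long v = strtoul(hex, &end, 16);

    /* Reject empty input, trailing junk, and values wider than 32 bits. */
    if (end == hex || *end != '\0' || v > 0xFFFFFFFFul)
        return -1;

    snprintf(out, outlen, "%lu.%lu.%lu.%lu",
             (v >> 24) & 0xff, (v >> 16) & 0xff, (v >> 8) & 0xff, v & 0xff);
    return 0;
}
```

Decoded this way, c0a80114 is 192.168.1.20 (A_1) and c0a80150 is 192.168.1.80 (B_1).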
Re: 9.0-RC1 panic in tcp_input: negative window.
On 10/26/11 22:53, John Baldwin wrote:
> On Wednesday, October 26, 2011 3:54:31 am Pawel Jakub Dawidek wrote:
> > On Mon, Oct 24, 2011 at 08:14:22AM -0400, John Baldwin wrote:
> > > On Sunday, October 23, 2011 11:58:28 am Pawel Jakub Dawidek wrote:
> > > > On Sun, Oct 23, 2011 at 11:44:45AM +0300, Kostik Belousov wrote:
> > > > > On Sun, Oct 23, 2011 at 08:10:38AM +0200, Pawel Jakub Dawidek wrote:
> > > > > > My suggestion would be that if we won't be able to fix it before 9.0, we should turn this assertion off, as the system seems to be able to recover.
> > > > >
> > > > > Shipped kernels have all assertions turned off.
> > > >
> > > > Yes, I'm aware of that, but many people compile their production kernels with INVARIANTS/INVARIANT_SUPPORT to fail early instead of, e.g., corrupting data. I'd be fine with moving this under DIAGNOSTIC or changing it into a printf, so it will be visible.
> > >
> > > No, the kernel is corrupting things in other places when this is true, so if you are running with INVARIANTS, we want to know about it. Specifically, in several places in TCP we assume that rcv_adv >= rcv_nxt, and depend on being able to do 'rcv_adv - rcv_nxt'.
> > >
> > > In this case, it looks like the difference is consistently less than one frame. I suspect the other end of the connection is sending just beyond the end of the advertised window (it probably assumes it is better to send a full frame if it has that much pending data, even though part of it is beyond the window edge, vs sending a truncated packet that just fills the window), and that that frame is accepted ok in the header prediction case and its ACK is delayed, but the next packet to arrive then trips over this assumption.
> > >
> > > Since 'win' is guaranteed to be non-negative and we explicitly cast 'rcv_adv - rcv_nxt' to (int) in the following line that the assert is checking for:
> > >
> > >     tp->rcv_wnd = imax(win, (int)(tp->rcv_adv - tp->rcv_nxt));
> > >
> > > I think we already handle this case ok and perhaps the assertion can just be removed? Not sure if others feel that it warrants a comment to note that this is the case being handled.
> >
> > I added debug to the places where rcv_adv and rcv_nxt are modified. Here is what happens before the panic occurs:
> >
> >     tcp_do_segment:1722 negative window: tp 0xfe000dab1b70 rcv_nxt 4022361548 rcv_adv 4022360100 diff -1448
> >     tcp_do_segment:2847 negative window: tp 0xfe000dab1b70 rcv_nxt 4022362298 rcv_adv 4022361548 diff -750
> >     tcp_do_segment:1722 negative window: tp 0xfe000dab1b70 rcv_nxt 4022363746 rcv_adv 4022362298 diff -1448
> >     tcp_do_segment:2847 negative window: tp 0xfe000dab1b70 rcv_nxt 4022364836 rcv_adv 4022363746 diff -1090
> >     tcp_do_segment:1722 negative window: tp 0xfe000dab1b70 rcv_nxt 4022366284 rcv_adv 4022364836 diff -1448
> >     tcp_do_segment:1722 negative window: tp 0xfe000dab1b70 rcv_nxt 4022370628 rcv_adv 4022369690 diff -938
> >     tcp_do_segment:1722 negative window: tp 0xfe000dab1b70 rcv_nxt 4022379140 rcv_adv 4022377692 diff -1448
> >     tcp_do_segment:1722 negative window: tp 0xfe000dab1b70 rcv_nxt 4022387792 rcv_adv 4022386344 diff -1448
> >     tcp_do_segment:2847 negative window: tp 0xfe000dab1b70 rcv_nxt 4022388890 rcv_adv 4022387792 diff -1098
> >     tcp_do_segment:1722 negative window: tp 0xfe000dab1b70 rcv_nxt 4022390338 rcv_adv 4022388890 diff -1448
> >     tcp_do_segment:2847 negative window: tp 0xfe000dab1b70 rcv_nxt 4022394563 rcv_adv 4022394342 diff -221
> >     panic: tcp_input negative window: tp 0xfe000dab1b70 rcv_nxt 4022394563 rcv_adv 4022394342 win=0 diff -221
> >
> > I can send you the full log if you want. I have plenty of messages where rcv_adv < rcv_nxt; not all of them trigger this assertion.
>
> The assertion would be triggered when the next packet arrives (as I said above). Try modifying your debugging output to also log if the ACK is delayed. I suspect it is not delayed until the last one. (Pushing out an ACK will reset rcv_adv to be beyond rcv_nxt in tcp_output(), so in the case of an immediate ACK, rcv_nxt > rcv_adv is only a transient condition all under a single lock invocation, so never visible to other consumers of the protocol control block.) If that is what you see, then that confirms what I guessed above and I will likely just remove the assertion in tcp_input() and patch the timewait code to handle this case.

Pawel, have you been able to confirm John's hypothesis? What I don't quite get is why we haven't had a lot more reports of this issue...

Cheers,
Lawrence
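John's point about the `imax(win, (int)(tp->rcv_adv - tp->rcv_nxt))` line can be checked outside the kernel with the numbers from the panic message. The sketch below is a user-space reconstruction, not the kernel source: `tcp_seq` is declared locally as a 32-bit unsigned type, `imax` is re-implemented, and the helper names are made up for illustration.

```c
#include <stdint.h>

/* TCP sequence numbers are 32-bit and wrap around; declared locally here. */
typedef uint32_t tcp_seq;

/* Larger of two ints, standing in for the kernel helper of the same name. */
int imax(int a, int b)
{
    return a > b ? a : b;
}

/* Signed distance from rcv_nxt to rcv_adv. The subtraction is done in
 * unsigned 32-bit arithmetic and then converted to int, so it goes
 * negative when rcv_nxt has run past rcv_adv. (The conversion relies on
 * two's-complement behaviour, which mainstream compilers provide.) */
int adv_minus_nxt(tcp_seq rcv_adv, tcp_seq rcv_nxt)
{
    return (int)(rcv_adv - rcv_nxt);
}

/* The clamp from the quoted line: a negative distance can never win
 * against a non-negative `win`. */
int clamp_rcv_wnd(int win, tcp_seq rcv_adv, tcp_seq rcv_nxt)
{
    return imax(win, adv_minus_nxt(rcv_adv, rcv_nxt));
}
```

With the values from the panic (rcv_nxt 4022394563, rcv_adv 4022394342, win=0), `adv_minus_nxt` returns -221, matching the logged diff, and `clamp_rcv_wnd` returns 0. That is the sense in which this particular line already copes with rcv_nxt > rcv_adv; it says nothing about the other places in TCP that John mentions.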
Re: 9.0-RC1 panic in tcp_input: negative window.
On Fri, Oct 28, 2011 at 11:29:34AM +1100, Lawrence Stewart wrote:
> On 10/26/11 22:53, John Baldwin wrote:
> > The assertion would be triggered when the next packet arrives (as I said
> > above). Try modifying your debugging output to also log if the ACK is
> > delayed. I suspect it is not delayed until the last one. (Pushing out an
> > ACK will reset rcv_adv to be beyond rcv_nxt in tcp_output(), so in the case
> > of an immediate ACK, rcv_nxt > rcv_adv is only a transient condition all
> > under a single lock invocation so never visible to other consumers of the
> > protocol control block.) If that is what you see, then that confirms what
> > I guessed above and I will likely just remove the assertion in tcp_input()
> > and patch the timewait code to handle this case.
>
> Pawel, have you been able to confirm John's hypothesis? [...]

Yeah, sorry. I moved the debug to the points where we drop the t_inpcb lock, and I still see rcv_nxt being greater than rcv_adv:

    tcp_do_segment:2970 negative window: tp 0xfe00685ee3d0 rcv_nxt 1312878324 rcv_adv 1312878187

This is just before the INP_WUNLOCK(tp->t_inpcb) under the 'check_delack' label. I see this a lot (it was logged 545 times for 11 different tp pointers during a 24h period).

    tcp_do_segment:3009 negative window: tp 0xfe005cfc6000 rcv_nxt 1442546453 rcv_adv 1442545722

This is just before calling tcp_output(). This one was logged 65 times for 3 different tp pointers.

I also placed a debug after the tcp_output() call, but it is not logged, so once we return from tcp_output() everything is fine.

The panic would have been triggered 115 times for 5 different tp pointers during that time.

I write 'tp pointers' as I'm not 100% sure whether the same pointer always represents the same connection or whether it is reused.

> [...] What I don't
> quite get is why we haven't had a lot more reports of this issue...

Maybe because my TCP/IP stack is heavily modified? ...not:) No idea, to be honest. Ask Ken to turn on INVARIANTS in 9.0-RC2 and we will see:)

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!                     http://yomoli.com