ARP problem with 6.2-STABLE Intel PRO/1000 NIC, latest em driver
The Machine:

I have a dual Xeon 5130 machine, Supermicro motherboard, with the 82563EB NIC. From dmesg:

CPU: Intel(R) Xeon(R) CPU5130 @ 2.00GHz (2000.08-MHz 686-class CPU)
cpu0: on acpi0
em0: port 0x2000-0x201f mem 0xda00-0xda01 irq 18 at device 0.0 on pci4

The machine has 4G RAM and a 3ware 9000 series RAID controller with 2 drives. pciconf -l says:

[EMAIL PROTECTED]:0:0: class=0x02 card=0x15d9 chip=0x10968086 rev=0x01 hdr=0x00
[EMAIL PROTECTED]:0:1: class=0x02 card=0x15d9 chip=0x10968086 rev=0x01 hdr=0x00

The symptom:

The machine boots OK, but can only intermittently make network connections. I eventually determined that it seems to see only a few ARP packets, so it's falling out of other machines' ARP tables, and is often unable to see the replies to its own ARP requests. It does see SOME ARPs though. When it is able to communicate with another machine, it does not appear to drop any packets between them (e.g. I scp'd a 500M file at 300Mbps to this machine).

When I run "tcpdump -n arp" I see a few ARPs, but not many. In a 1-minute period, I saw 3 ARP who-has/reply packets. On a different machine on the same ethernet switch, I saw 225 who-has/reply packets in the same 1-minute period.

I've tried different cables, and a different switch. I started with 6.2-RELEASE, and then went to 6.2-STABLE on 3/3/07 to get the latest em driver fixes. I've used SMP and GENERIC kernels. I get the same results in all cases. There are no firewall rules installed.

I plugged in a USB ethernet adapter (realtek), and it works straight away. "tcpdump -n arp" sees the same noise as other machines on that LAN.

I read through the recent threads on the em driver, but didn't see any reported symptoms like this. Has anyone seen anything like this? Got any hints for me? Am I doing something stupid? Did I leave out any useful information about my configuration?
Thanks,

Mark
--
Mark Costlow | Southwest Cyberport | Fax: +1-505-232-7975
[EMAIL PROTECTED] | Web: www.swcp.com | Voice: +1-505-232-7992

abq-strange.com -- Interesting photos taken in Albuquerque, NM
Last post: Art Is OK...And Dangerous - 2007-03-02 10:27:17
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
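For anyone wanting to repeat the ARP comparison from the message above (3 packets per minute on the affected box versus 225 on a neighbour), here is a minimal Python sketch that counts who-has and reply lines in saved "tcpdump -n arp" text output. The function name count_arp and the sample lines are illustrative assumptions, not taken from the thread; the pattern is meant to tolerate both the older "arp who-has" wording and the newer "ARP, Request who-has" wording, but it has not been checked against every tcpdump version.

```python
import re
from collections import Counter

# Hypothetical helper: matches "arp who-has", "arp reply",
# "ARP, Request who-has" and "ARP, Reply" style lines.
ARP_LINE = re.compile(r"\barp\b.*?\b(who-has|reply)\b", re.IGNORECASE)


def count_arp(lines):
    """Count ARP requests and replies in tcpdump text output."""
    counts = Counter()
    for line in lines:
        match = ARP_LINE.search(line)
        if match:
            kind = "request" if match.group(1).lower() == "who-has" else "reply"
            counts[kind] += 1
    return counts


# Illustrative sample lines (made-up addresses from the documentation range).
SAMPLE = [
    "10:27:17.000001 arp who-has 192.0.2.1 tell 192.0.2.7",
    "10:27:17.000350 arp reply 192.0.2.1 is-at 00:30:48:8c:71:54",
    "10:27:18.120000 ARP, Request who-has 192.0.2.9 tell 192.0.2.1, length 28",
    "10:27:19.000000 IP 192.0.2.7.22 > 192.0.2.1.51000: tcp 0",
]

counts = count_arp(SAMPLE)
print(counts["request"], counts["reply"])
```

Capturing for the same fixed interval on two machines on the same switch and feeding both captures through the same counter gives a like-for-like comparison.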
Re: ARP problem with 6.2-STABLE Intel PRO/1000 NIC, latest em driver
On Sun, Mar 04, 2007 at 11:37:01PM -0800, Jack Vogel wrote:
>
> These are one of our latest NICs, I have had no trouble with these
> but I'm used to using them on an Intel design, not SuperMicro.
>
> First question, do you get the same behavior on both ports?
> My first guess is that this is a BIOS/management problem.
>
> Double check SM website and see if there's any support updates
> to firmware for the system.

I left out a couple of things. Yes, it does the same thing on both em0 and em1. And, the in-house Linux advocate loaded Debian on the box and that worked as expected.

I'll check SM's web site for BIOS updates today.

Mark
Re: ARP problem with 6.2-STABLE Intel PRO/1000 NIC, latest em driver
SBKR100 USB 10/100 LAN, rev 1.10/1.00, addr 2
miibus0: on rue0
ruephy0: on miibus0
ruephy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
rue0: Ethernet address: 00:10:60:dd:ed:e9
rue0: if_start running deferred for Giant
Timecounter "TSC" frequency 278406 Hz quality 800
Timecounters tick every 1.000 msec
acd0: CDRW at ata0-master UDMA33
da0 at twa0 bus 0 target 0 lun 0
da0: Fixed Direct Access SCSI-3 device
da0: 100.000MB/s transfers
da0: 238408MB (488259584 512 byte sectors: 255H 63S/T 30392C)
da1 at twa0 bus 0 target 1 lun 0
da1: Fixed Direct Access SCSI-3 device
da1: 100.000MB/s transfers
da1: 238408MB (488259584 512 byte sectors: 255H 63S/T 30392C)
Trying to mount root from ufs:/dev/da0s1a
em0: link state changed to UP
em0: promiscuous mode enabled
em0: promiscuous mode disabled
twa0: INFO: (0x04: 0x0029): Verify started: unit=0
twa0: INFO: (0x04: 0x002B): Verify completed: unit=0

This is while booting GENERIC. I can boot SMP and send that too if you suggest.

Here's vmstat -i:

interrupt                  total     rate
irq1: atkbd0                   2        0
irq6: fdc0                     3        0
irq14: ata0                   47        0
irq16: uhci3               14836        0
irq17: uhci0 ehci0            25        0
irq18: em0 uhci2           91850        2
irq24: twa0                14828        0
cpu0: timer             79015190     1999
Total                   79136781     2003

Is the fact that em0 and uhci2 are sharing an interrupt significant?

Thanks,

Mark
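As a rough aid for the shared-interrupt question above, the following Python sketch lists every IRQ line in "vmstat -i" output that names more than one device. The helper name shared_irqs is a hypothetical choice, and the parser assumes the layout shown in this message (device names after the colon, followed by a total column and a rate column). Shared PCI interrupts are common, so a hit here only narrows where to look; it is not evidence of a fault by itself.

```python
def shared_irqs(vmstat_text):
    """Return {irq: [devices]} for IRQ lines that list more than one device."""
    shared = {}
    for line in vmstat_text.splitlines():
        name, sep, rest = line.partition(":")
        # Skip the header, the "Total" row and non-IRQ rows such as cpu0.
        if not sep or not name.startswith("irq"):
            continue
        fields = rest.split()
        # Assumption: the last two fields are the total and rate columns.
        devices = fields[:-2]
        if len(devices) > 1:
            shared[name] = devices
    return shared


# Output copied from the message above.
SAMPLE = """\
interrupt                  total     rate
irq1: atkbd0                   2        0
irq6: fdc0                     3        0
irq14: ata0                   47        0
irq16: uhci3               14836        0
irq17: uhci0 ehci0            25        0
irq18: em0 uhci2           91850        2
irq24: twa0                14828        0
cpu0: timer             79015190     1999
Total                   79136781     2003
"""

result = shared_irqs(SAMPLE)
for irq, devices in result.items():
    print(irq, "shared by", ", ".join(devices))
```

On this sample the sketch reports irq17 (uhci0, ehci0) and irq18 (em0, uhci2).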
Re: ARP problem with 6.2-STABLE Intel PRO/1000 NIC, latest em driver
On Mon, Mar 05, 2007 at 10:02:26AM -0800, Jack Vogel wrote:
> On 3/5/07, Jack Vogel <[EMAIL PROTECTED]> wrote:
> >On 3/5/07, Mark Costlow <[EMAIL PROTECTED]> wrote:
> >> On Mon, Mar 05, 2007 at 08:41:01AM -0800, Jack Vogel wrote:
> >> > >
> >> > >Maybe more of your dmesg might help as it could show interrupt issues
> >> > >that perhaps others could help diagnose
> >> >
> >> > Yes, agreed, this might be revealing.
> >>
> >> Here's the full dmesg. Thanks for looking at this.
> >>
> >>
> >> Copyright (c) 1992-2007 The FreeBSD Project.
> >> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> >> The Regents of the University of California. All rights reserved.
> >> FreeBSD is a registered trademark of The FreeBSD Foundation.
> >> FreeBSD 6.2-STABLE #0: Sun Mar 4 22:40:38 MST 2007
> >> [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC
> >> ACPI APIC Table:
> >> Timecounter "i8254" frequency 1193182 Hz quality 0
> >> CPU: Intel(R) Xeon(R) CPU5130 @ 2.00GHz (2000.08-MHz 686-class CPU)
> >> Origin = "GenuineIntel" Id = 0x6f6 Stepping = 6
> >> Features=0xbfebfbff
> >> MOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
> >> Features2=0x4e33d,CX16,,,>
> >> AMD Features=0x2000
> >> AMD Features2=0x1
> >> Cores per package: 2
> >> real memory = 3489005568 (3327 MB)
> >> avail memory = 3414384640 (3256 MB)
> >> ioapic0 irqs 0-23 on motherboard
> >> ioapic1 irqs 24-47 on motherboard
> >> kbd1 at kbdmux0
> >> ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
> >> acpi0: on motherboard
> >> acpi0: Power Button (fixed)
> >> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> >> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
> >> cpu0: on acpi0
> >> acpi_throttle0: on cpu0
> >> pcib0: port 0xcf8-0xcff on acpi0
> >> pci0: on pcib0
> >> pcib1: at device 2.0 on pci0
> >> pci1: on pcib1
> >> pcib2: irq 16 at device 0.0 on pci1
> >> pci2: on pcib2
> >> pcib3: irq 16 at device 0.0 on pci2
> >> pci3: on pcib3
> >> pcib4: irq 18 at device 2.0 on pci2
> >> pci4: on pcib4
> >> em0: port 0x2000-0x201f mem 0xda00-0xda01 irq 18 at device 0.0 on pci4
> >> em0: Ethernet address: 00:30:48:8c:71:54
> >> em1: port 0x2020-0x203f mem 0xda02-0xda03 irq 19 at device 0.1 on pci4
> >> em1: Ethernet address: 00:30:48:8c:71:55
> >> pcib5: at device 0.3 on pci1
> >> pci5: on pcib5
> >> 3ware device driver for 9000 series storage controllers, version: 3.60.02.012
> >> twa0: <3ware 9000 series Storage Controller> port 0x3000-0x303f mem 0xd800-0xd9ff,0xda10-0xda100fff irq 24 at device 1.0 on pci5
> >> twa0: [GIANT-LOCKED]
> >> twa0: INFO: (0x15: 0x1300): Controller details:: Model 9550SX-4LP, 4 ports, Firmware FE9X 3.04.01.011, BIOS BE9X 3.04.00.002
> >> pci0: at device 8.0 (no driver attached)
> >> pcib6: irq 17 at device 28.0 on pci0
> >> pci6: on pcib6
> >> uhci0: port 0x1800-0x181f irq 17 at device 29.0 on pci0
> >> uhci0: [GIANT-LOCKED]
> >> usb0: on uhci0
> >> usb0: USB revision 1.0
> >> uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> >> uhub0: 2 ports with 2 removable, self powered
> >> uhci1: port 0x1820-0x183f irq 19 at device 29.1 on pci0
> >> uhci1: [GIANT-LOCKED]
> >> usb1: on uhci1
> >> usb1: USB revision 1.0
> >> uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> >> uhub1: 2 ports with 2 removable, self powered
> >> uhci2: port 0x1840-0x185f irq 18 at device 29.2 on pci0
> >> uhci2: [GIANT-LOCKED]
> >> usb2: on uhci2
> >> usb2: USB revision 1.0
> >> uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> >> uhub2: 2 ports with 2 removable, self powered
> >> uhci3: port 0x1860-0x187f irq 16 at device 29.3 on pci0
> >> uhci3: [GIANT-LOCKED]
> >> usb3: on uhci3
> >> usb3: USB revision 1.0
> >> uhub3: Intel UHCI r
Re: PATCH : ARP problem with 6.2-STABLE Intel PRO/1000 NIC, latest em driver
On Mon, Mar 05, 2007 at 02:13:36PM -0800, Jack Vogel wrote:
[...snip...]
> >>
> >> Don't bother installing CURRENT, just got out of my meeting and I found
> >> out what the problem is. There is indeed an issue with management, and
> >> it's something our test group isn't set up to test. I will send a patch to
> >> try sometime before end of day.
>
> OK, here is the patch, this should fix it...

Hi Jack, the patch didn't seem to have any effect. When I run "tcpdump -n arp" after rebooting with this patch, I still see 2-3 ARPs per minute instead of 100-200 per minute.

I was patching against:

/*$FreeBSD: src/sys/dev/em/if_em.c,v 1.65.2.22 2007/03/01 17:32:27 csjp Exp $*/

Is that correct?

I tried both SMP and non-SMP kernels, with the same results.

Is there anything I can do to gather some additional debug information from the system while it's running?

I neglected to mention before the specific motherboard model: Supermicro X7DVL-E. There is no IPMI card installed, and no IPMI setting in the BIOS.

Thanks,

Mark
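When comparing notes on which driver revision a patch was applied to, it can help to pull the fields out of the $FreeBSD$ tag mechanically. The Python sketch below does that for the tag quoted in the message above. The pattern and the group names are illustrative assumptions about the usual "path,v revision date time author" layout, not an official parser.

```python
import re

# Hypothetical pattern for a CVS-style $FreeBSD$ ident tag.
IDENT = re.compile(
    r"\$FreeBSD: (?P<path>\S+),v (?P<rev>[\d.]+) "
    r"(?P<date>\S+) (?P<time>\S+) (?P<author>\S+)"
)

# Tag copied from the message above.
tag = "/*$FreeBSD: src/sys/dev/em/if_em.c,v 1.65.2.22 2007/03/01 17:32:27 csjp Exp $*/"

match = IDENT.search(tag)
if match:
    print(match.group("path"), match.group("rev"), match.group("date"))
```

Running the same extraction on both the patched file and the file the patch was generated from makes a revision mismatch easy to spot.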
Re: ARP problem with 6.2-STABLE Intel PRO/1000 NIC, latest em
On Tue, Mar 06, 2007 at 10:24:46AM +, Chris Rees wrote:
>
> If your NIC is knackered, where are you from? I can post you one I'm
> not using, instead of you buying one. It's a Realtek PCI 8139 10/100
> Mb/s. Let me know if you're interested.

Thank you very much for the offer! However, I have tried another NIC in the machine (a Realtek USB adaptor) and it worked normally. At that point I would suspect the hardware, except that when this machine had Linux loaded on it, the onboard NIC worked normally.

The box is in a 1U case with no spare PCI slots, so I need the motherboard NIC to work for it to be useful long-term.

Thanks,

Mark
Re: File system failure! URGENT Help needed!
On Sun, Jun 23, 2002 at 04:03:15PM +0600, [EMAIL PROTECTED] wrote:
> HL> If you have a spare drive that is exactly the same type you could make a
> HL> binary copy of the disk to it, and don't need to worry about making the
> HL> problem worse while recovering the data with inode magic.
>
> Yes, of course, I did it. Thanks!

Now that you have a copy of the data, this tool can help you extract your data from the damaged file system:

http://www.porcupine.org/forensics/tct.html

Be sure to read the instructions carefully before starting to use it. It is a slow process, but it works.

Mark
--
Mark Costlow | Southwest Cyberport | Fax: +1-505-232-7975
[EMAIL PROTECTED] | Web: www.swcp.com | Voice: +1-505-232-7992

"Education is never a waste" - Viscount du Valmont

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-stable" in the body of the message