Hi all,

I backported the MPS commits and asked the customer to pass
"pci=pcie_bus_peer2peer" to the kernel to limit MPS to 128, and the issue
disappeared. It sounds like this is a BIOS bug.

Thanks for all of your help.

Best Regards,
Joe

On 11/29/12 23:52, Fujinaka, Todd wrote:
> Someone else pointed this out to me locally. If you have a non-client BIOS,
> you should be able to set the MaxPayloadSize using setpci. You have to make
> sure that you're being consistent throughout all the associated links.
>
> Todd Fujinaka
> Technical Marketing Engineer
> LAN Access Division (LAD)
> Intel Corporation
> todd.fujin...@intel.com
> (503) 712-4565
>
>
> -----Original Message-----
> From: Ethan Zhao [mailto:ethan.ker...@gmail.com]
> Sent: Wednesday, November 28, 2012 7:10 PM
> To: Fujinaka, Todd
> Cc: Joe Jin; Ben Hutchings; Mary Mcgrath; net...@vger.kernel.org;
> e1000-de...@lists.sf.net; linux-kernel@vger.kernel.org; linux-pci
> Subject: Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang
>
> Joe,
> Possibly your customer is running a kernel without source code on a
> platform whose vendor doesn't want to fix the BIOS issue (is that an HP/Dell
> server?).
> Anyway, to see whether it is a payload issue or not, you could change the
> payload size of those devices with the setpci tool and set the link retrain
> bit to trigger link retraining, to debug the issue and identify the root
> cause.
> I think it is much easier than modifying the BIOS or the EEPROM of the NIC.
>
> e.g.
> set device control register to 0f 00 (128 bytes payload size)
> # setpci -v -s 00:02.0 98.w=000f
> set device link control register to 60h (retrain the link)
> # setpci -v -s 00:02.0 a0.b=60
>
> Hope it works, just my 2 cents.
>
> ethan.z...@oracle.com
>
> On Wed, Nov 28, 2012 at 11:53 PM, Fujinaka, Todd <todd.fujin...@intel.com>
> wrote:
>> The only EEPROM I know about or can speak to is the one attached to the
>> 82571 and it doesn't set the MaxPayloadSize. That's done by the BIOS.
>>
>> Todd Fujinaka
>> Technical Marketing Engineer
>> LAN Access Division (LAD)
>> Intel Corporation
>> todd.fujin...@intel.com
>> (503) 712-4565
>>
>>
>> -----Original Message-----
>> From: Joe Jin [mailto:joe....@oracle.com]
>> Sent: Wednesday, November 28, 2012 12:31 AM
>> To: Ben Hutchings
>> Cc: Fujinaka, Todd; Mary Mcgrath; net...@vger.kernel.org;
>> e1000-de...@lists.sf.net; linux-kernel@vger.kernel.org; linux-pci
>> Subject: Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang
>>
>> On 11/28/12 02:10, Ben Hutchings wrote:
>>> On Tue, 2012-11-27 at 17:32 +0000, Fujinaka, Todd wrote:
>>>> Forgive me if I'm being too repetitious as I think some of this has
>>>> been mentioned in the past.
>>>>
>>>> We (and by we I mean the Ethernet part and driver) can only change
>>>> the advertised availability of a larger MaxPayloadSize. The size is
>>>> negotiated by both sides of the link when the link is established.
>>>> The driver should not change the size of the link as it would be
>>>> poking at registers outside of its scope and is controlled by the
>>>> upstream bridge (not us).
>>> [...]
>>>
>>> MaxPayloadSize (MPS) is not negotiated between devices but is
>>> programmed by the system firmware (at least for devices present at
>>> boot - the kernel may be responsible in case of hotplug). You can
>>> use the kernel parameter 'pci=pcie_bus_perf' (or one of several
>>> others) to set a policy that overrides this, but no policy will allow
>>> setting MPS above the device's MaxPayloadSizeSupported (MPSS).
>>>
>>
>> Ben,
>>
>> Unfortunately I'm using a 3.0.x kernel and this is not included in that
>> kernel. So I'm trying to use ethtool to modify it in the EEPROM to see
>> whether that helps.
>>
>>
>> Todd, I'll review the MaxPayload of all devices, but I have to say that if
>> there is a mismatch, the customer cannot modify it from the BIOS because
>> there is no entry for it there. To test this, we have to find a way to
>> verify whether this is the root cause, so I still need to find the offset
>> in the EEPROM.
>>
>> Thanks in advance,
>> Joe
>>

--
Oracle <http://www.oracle.com>
Joe Jin | Software Development Senior Manager | +8610.6106.5624
ORACLE | Linux and Virtualization
No. 24 Zhongguancun Software Park, Haidian District | 100193 Beijing
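
For anyone reading this thread later: the setpci values quoted above are easier to check if you decode the Max_Payload_Size field yourself. In the PCIe Device Control register that field is bits 7:5, and an encoding n means a payload of 128 << n bytes, so 000f decodes to 128. The sketch below is only an illustration of that arithmetic. The function names mps_from_devctl and devctl_with_mps are made up for this note, and the script does not touch any hardware.

```shell
#!/bin/sh
# Hypothetical helpers (names invented for this note, not part of pciutils).

# Print the Max_Payload_Size in bytes encoded in a Device Control value.
# $1: 16-bit register value in hex without a 0x prefix, as setpci prints it.
# Encodings above 5 (4096 bytes) are reserved and are not checked here.
mps_from_devctl() {
    val=$((0x$1))
    field=$(( (val >> 5) & 0x7 ))      # MPS field lives in bits 7:5
    echo $(( 128 << field ))
}

# Print a new Device Control value with only the MPS field changed.
# $1: current register value in hex; $2: wanted payload in bytes (128..4096).
devctl_with_mps() {
    val=$((0x$1))
    field=0
    size=128
    while [ "$size" -lt "$2" ]; do     # find n such that 128 << n >= $2
        size=$((size * 2))
        field=$((field + 1))
    done
    printf '%04x\n' $(( (val & ~0xe0 & 0xffff) | (field << 5) ))
}
```

A possible use is to read the register with setpci, run the value through mps_from_devctl, and compare the result across every device on the path, since the payload size has to be consistent on all associated links. Note that the 98 and a0 offsets in the quoted example belong to that particular device: Device Control sits at offset 8 inside the PCI Express capability, and the capability itself can start at a different address on other devices.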