Hi Yijing,

Thanks for the reference. The patch looks good to me, but I have no chance to
test it in the customer's environment.

Best Regards,
Joe

On 12/19/12 13:52, Yijing Wang wrote:
> On 2012/12/19 11:04, Joe Jin wrote:
>> Hi all,
>>
>> I backported the MPS commits and asked the customer to pass
>> "pci=pcie_bus_peer2peer" to the kernel to limit MPS to 128, and the issue
>> disappeared. It sounds like this is a BIOS bug.
>>
>
> Hi Joe,
> I found a similar problem when doing PCI hotplug; the discussion is here:
> http://marc.info/?l=linux-pci&m=134810569924220&w=2
> We are trying to improve the Linux kernel so that this problem is easier to
> debug, based on Bjorn's suggestion. Jon sent out the first version of the
> patch:
> http://marc.info/?l=linux-pci&m=135002016005274&w=2
> I think we can go further here:
> http://marc.info/?l=linux-pci&m=135115581307869&w=2
> I hope this information can help you.
>
> Thanks!
> Yijing.
>
>> Thanks to all of you for your help.
>>
>> Best Regards,
>> Joe
>>
>> On 11/29/12 23:52, Fujinaka, Todd wrote:
>>> Someone else pointed this out to me locally. If you have a non-client BIOS,
>>> you should be able to set the MaxPayloadSize using setpci. You have to make
>>> sure that you're being consistent throughout all the associated links.
>>>
>>> Todd Fujinaka
>>> Technical Marketing Engineer
>>> LAN Access Division (LAD)
>>> Intel Corporation
>>> todd.fujin...@intel.com
>>> (503) 712-4565
>>>
>>>
>>> -----Original Message-----
>>> From: Ethan Zhao [mailto:ethan.ker...@gmail.com]
>>> Sent: Wednesday, November 28, 2012 7:10 PM
>>> To: Fujinaka, Todd
>>> Cc: Joe Jin; Ben Hutchings; Mary Mcgrath; net...@vger.kernel.org;
>>> e1000-de...@lists.sf.net; linux-kernel@vger.kernel.org; linux-pci
>>> Subject: Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang
>>>
>>> Joe,
>>> Possibly your customer is running a kernel without source code on a
>>> platform whose vendor wouldn't like to fix the BIOS issue (is that an
>>> HP/Dell server?).
>>> Anyway, to see whether it is a payload issue or not, you could change the
>>> payload size on those devices with the setpci tool and set the link
>>> retrain bit to trigger link retraining, to debug the issue and identify
>>> the root cause. I think that is much easier than modifying the BIOS or
>>> the EEPROM of the NIC.
>>>
>>> e.g.
>>> set device control register to 0f 00 (128 bytes payload size)
>>> # setpci -v -s 00:02.0 98.w=000f
>>> set device link control register to 60h (retrain the link)
>>> # setpci -v -s 00:02.0 a0.b=60
>>>
>>> Hope it works, just my 2 cents.
>>>
>>> ethan.z...@oracle.com
>>>
>>> On Wed, Nov 28, 2012 at 11:53 PM, Fujinaka, Todd <todd.fujin...@intel.com>
>>> wrote:
>>>> The only EEPROM I know about or can speak to is the one attached to the
>>>> 82571 and it doesn't set the MaxPayloadSize. That's done by the BIOS.
>>>>
>>>> Todd Fujinaka
>>>> Technical Marketing Engineer
>>>> LAN Access Division (LAD)
>>>> Intel Corporation
>>>> todd.fujin...@intel.com
>>>> (503) 712-4565
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Joe Jin [mailto:joe....@oracle.com]
>>>> Sent: Wednesday, November 28, 2012 12:31 AM
>>>> To: Ben Hutchings
>>>> Cc: Fujinaka, Todd; Mary Mcgrath; net...@vger.kernel.org;
>>>> e1000-de...@lists.sf.net; linux-kernel@vger.kernel.org; linux-pci
>>>> Subject: Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang
>>>>
>>>> On 11/28/12 02:10, Ben Hutchings wrote:
>>>>> On Tue, 2012-11-27 at 17:32 +0000, Fujinaka, Todd wrote:
>>>>>> Forgive me if I'm being too repetitious as I think some of this has
>>>>>> been mentioned in the past.
>>>>>>
>>>>>> We (and by we I mean the Ethernet part and driver) can only change
>>>>>> the advertised availability of a larger MaxPayloadSize. The size is
>>>>>> negotiated by both sides of the link when the link is established.
>>>>>> The driver should not change the size of the link as it would be
>>>>>> poking at registers outside of its scope and is controlled by the
>>>>>> upstream bridge (not us).
>>>>> [...]
>>>>>
>>>>> MaxPayloadSize (MPS) is not negotiated between devices but is
>>>>> programmed by the system firmware (at least for devices present at
>>>>> boot - the kernel may be responsible in case of hotplug). You can
>>>>> use the kernel parameter 'pci=pcie_bus_perf' (or one of several
>>>>> others) to set a policy that overrides this, but no policy will allow
>>>>> setting MPS above the device's MaxPayloadSizeSupported (MPSS).
>>>>>
>>>>
>>>> Ben,
>>>>
>>>> Unfortunately I'm using a 3.0.x kernel and this is not included in that
>>>> kernel.
>>>> So I'm trying to use ethtool to modify it in the EEPROM to see whether
>>>> that helps.
>>>>
>>>>
>>>> Todd, I'll review MaxPayload for all devices, but I have to say that if
>>>> it is mismatched, the customer cannot modify it from the BIOS because
>>>> there is no entry for it there. To test it, we have to find a way to
>>>> verify whether this is the root cause, so I still need to find the
>>>> offset in the EEPROM.
>>>>
>>>> Thanks in advance,
>>>> Joe
>>>>
>>
>>
>
>

--
Oracle <http://www.oracle.com>
Joe Jin | Software Development Senior Manager | +8610.6106.5624
ORACLE | Linux and Virtualization
No. 24 Zhongguancun Software Park, Haidian District | 100193 Beijing
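
For reference, the arithmetic behind the setpci example quoted above (writing 000f to the Device Control register for a 128-byte payload) can be sketched as follows. This is only an illustration, not something posted in the thread: the function names `decode_devctl` and `with_mps` are made up for this note, and it assumes the standard PCI Express layout in which Device Control bits 7:5 hold Max_Payload_Size and bits 14:12 hold Max_Read_Request_Size, each encoded as 128 << code. The offsets 98 and a0 in the quoted commands are specific to that device (they depend on where its PCI Express capability sits), so check `lspci -vv` before reusing them.

```python
# Hypothetical helpers for reasoning about a PCIe Device Control value.
# Assumption: standard field layout (MPS in bits 7:5, MRRS in bits 14:12).

VALID_SIZES = (128, 256, 512, 1024, 2048, 4096)


def decode_devctl(devctl: int) -> dict:
    """Return the payload and read-request sizes encoded in a Device Control value."""
    mps_code = (devctl >> 5) & 0x7    # bits 7:5
    mrrs_code = (devctl >> 12) & 0x7  # bits 14:12
    if mps_code > 5 or mrrs_code > 5:
        # Codes 6 and 7 are reserved encodings.
        raise ValueError("reserved size encoding in Device Control value")
    return {"mps_bytes": 128 << mps_code, "mrrs_bytes": 128 << mrrs_code}


def with_mps(devctl: int, mps_bytes: int) -> int:
    """Return devctl with only the Max_Payload_Size field replaced."""
    if mps_bytes not in VALID_SIZES:
        raise ValueError("MPS must be a power of two from 128 to 4096")
    code = VALID_SIZES.index(mps_bytes)  # 128 -> 0, 256 -> 1, ...
    cleared = devctl & 0xFFFF & ~(0x7 << 5)
    return cleared | (code << 5)


if __name__ == "__main__":
    # The value written in the quoted example: MPS field is 0, i.e. 128 bytes.
    print(decode_devctl(0x000F))
    # Value to write if only the MPS field should change to 256 bytes.
    print(hex(with_mps(0x000F, 256)))
```

Reading the current register value first (for example with `setpci -s 00:02.0 98.w`) and changing only the MPS field avoids clobbering the other Device Control bits, which a fixed value such as 000f would overwrite. As noted earlier in the thread, the chosen MPS has to be consistent across all the associated links and must not exceed any device's MaxPayloadSizeSupported.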