> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Wednesday, March 06, 2013 2:48 AM
> To: Jia Hongtao-B38951
> Cc: Wood Scott-B07421; Stuart Yoder; linuxppc-dev@lists.ozlabs.org; Kumar
> Gala
> Subject: Re: [PATCH V4] powerpc/85xx: Add machine check handler to fix
> PCIe erratum on mpc85xx
>
> On 03/05/2013 04:12:30 AM, Jia Hongtao-B38951 wrote:
> >
> > > -----Original Message-----
> > > From: Wood Scott-B07421
> > > Sent: Tuesday, March 05, 2013 7:46 AM
> > > To: Stuart Yoder
> > > Cc: Jia Hongtao-B38951; linuxppc-dev@lists.ozlabs.org; Kumar Gala
> > > Subject: Re: [PATCH V4] powerpc/85xx: Add machine check handler to fix
> > > PCIe erratum on mpc85xx
> > >
> > > On 03/04/2013 10:16:10 AM, Stuart Yoder wrote:
> > > > On Mon, Mar 4, 2013 at 2:40 AM, Jia Hongtao <b38...@freescale.com>
> > > > wrote:
> > > > > A PCIe erratum of mpc85xx may causes a core hang when a link of PCIe
> > > > > goes down. when the link goes down, Non-posted transactions issued
> > > > > via the ATMU requiring completion result in an instruction stall.
> > > > > At the same time a machine-check exception is generated to the core
> > > > > to allow further processing by the handler. We implements the handler
> > > > > which skips the instruction caused the stall.
> > > >
> > > > Can you explain at a high level how just skipping an instruction solves
> > > > anything? If you just skip a load/store and continue like nothing is
> > > > wrong, isn't your system possibly in a really bad state.
> > >
> > > If the instruction was a load, we probably at least want to fill the
> > > destination register with 0xffffffff or similar.
> >
> > You discuss this with Liu Shuo about a year ago.
> > here is the log:
> >
> > "
> > On 02/01/2012 02:18 AM, shuo....@freescale.com wrote:
> > > v3 : Skip the instruction only. Don't access the user space memory in
> > > mechine check.
> >
> > It may be the least bad option for now, but be aware that there's a
> > small chance that this will cause a leak of sensitive information
> > (such as a piece of a crypto key that happened to be sitting in the
> > register to be loaded into).
>
> Yes, that's (one reason) why you'd want to fill in a known value. Note
> the "for now". :-)
>
> -Scott

I don't think there is a compelling reason to fill the destination
register with 0xffffffff. There is a small chance that 0xffffffff would be
treated as regular data rather than as an error indicator. Setting the
register may also affect user space under certain circumstances. So I
think simply skipping the stalled instruction, without touching the
destination register, is an acceptable option for this fix.

-Hongtao.
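
To make the two options in this thread concrete, here is a minimal user-space sketch in C. It is not the code from the patch. `struct saved_regs`, `is_integer_load()` and `skip_stalled_instruction()` are hypothetical names invented for illustration, and the opcode list covers only a few non-update integer loads, so it is deliberately incomplete. The `fill_load_target` flag selects between skipping only and also writing 0xffffffff to the load's destination register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the saved register state.
 * This is NOT the kernel's struct pt_regs. */
struct saved_regs {
    uint32_t gpr[32];
    uint32_t nip;   /* address of the instruction that stalled */
};

/* Field extraction for 32-bit PowerPC instruction words. */
#define PPC_PRIMARY(inst) ((uint32_t)(inst) >> 26)          /* primary opcode */
#define PPC_RT(inst)      (((uint32_t)(inst) >> 21) & 0x1f) /* target register */
#define PPC_XO(inst)      (((uint32_t)(inst) >> 1) & 0x3ff) /* X-form extended opcode */

/* Recognize a small, intentionally incomplete set of integer loads.
 * Update forms (lwzu, lbzu, ...) are left out to keep the sketch short. */
bool is_integer_load(uint32_t inst)
{
    switch (PPC_PRIMARY(inst)) {
    case 32: /* lwz */
    case 34: /* lbz */
    case 40: /* lhz */
    case 42: /* lha */
        return true;
    case 31: /* X-form: look at the extended opcode */
        switch (PPC_XO(inst)) {
        case 23:  /* lwzx  */
        case 87:  /* lbzx  */
        case 279: /* lhzx  */
        case 343: /* lhax  */
        case 534: /* lwbrx */
        case 790: /* lhbrx */
            return true;
        default:
            return false;
        }
    default:
        return false;
    }
}

/* Step past the stalled instruction. When fill_load_target is set and the
 * instruction is a recognized load, overwrite the destination register
 * with all ones so stale register contents are not left behind. */
void skip_stalled_instruction(struct saved_regs *regs, uint32_t inst,
                              bool fill_load_target)
{
    if (fill_load_target && is_integer_load(inst))
        regs->gpr[PPC_RT(inst)] = 0xffffffffu;

    regs->nip += 4; /* every instruction here is 4 bytes wide */
}
```

With `fill_load_target` false, the register keeps whatever value it held before the load, which is the information-leak concern raised above. With it true, the caller sees 0xffffffff, which is the ambiguity concern: that value can also be legitimate data.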