On Thu, Aug 04, 2016 at 03:58:28PM -0700, York Sun wrote:
> On e500v1, read fault exception enable (RFXE) controls whether
> assertion of core_fault_in causes a machine check interrupt.
> Assertion of core_fault_in can result from uncorrectable data
> error, such as an L2 multibit ECC error. It can also occur from
> a system error if logic on the integrated device signals a fault
> for nonfatal errors. RFXE bit is cleared out of reset, and should
> be left clear for normal operation. Assertion of core_fault_in does
> not cause a machine check.
>
> RFXE is set specifically for RIO (Rapid IO) and PCI for book E to
> catch the errors by machine check. With this bit set, EDAC driver
> can't get the interrupt in case of uncorrectable error. So this
> bit is cleared in favor of EDAC. However, the benefit of catching
> such uncorrectable error doesn't outweight the other errors which
> may hang the system. Beside, e500v2 has different errors maksed
> by RFXE, and e500mc doesn't support this bit. It is more reasonable
> to leave RFXE as is in EDAC driver, and leave the uncorrectable
> errors triggering machine check for e500v1.

Very nice, thanks for expanding it!

Two final remarks:

- please use a spell checker

- now, what happens if you leave RFXE clear and mpc85xx_edac gets the
  error? Is it going to do proper error handling of the uncorrectable
  error or are we better off handling the error in the #MC interrupt
  handler?

IOW, is mpc85xx_edac well equipped to handle those multibit errors or
should we leave the current setting as is?

Thanks.

--
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.