On 6/23/19 7:44 AM, Reza Arbab wrote:
> Hi Mahesh,
>
> On Fri, Jun 21, 2019 at 12:35:08PM +0530, Mahesh Jagannath Salgaonkar wrote:
>> On 6/21/19 6:27 AM, Santosh Sivaraj wrote:
>>> -	blocking_notifier_call_chain(&mce_notifier_list, 0, &evt);
>>> +	rc = blocking_notifier_call_chain(&mce_notifier_list, 0, evt);
>>> +	if (rc & NOTIFY_STOP_MASK) {
>>> +		evt->disposition = MCE_DISPOSITION_RECOVERED;
>>> +		regs->msr |= MSR_RI;
>>
>> What is the reason for setting MSR_RI ? I don't think this is a good
>> idea. MSR_RI = 0 means system got MCE interrupt when SRR0 and SRR1
>> contents were live and was overwritten by MCE interrupt. Hence this
>> interrupt is unrecoverable irrespective of whether machine check handler
>> recovers from it or not.
>
> Good catch! I think this is an artifact from when I was first trying to
> get all this working.
>
> Instead of setting MSR_RI, we should probably just check for it. Ie,
>
> 	if ((rc & NOTIFY_STOP_MASK) && (regs->msr & MSR_RI)) {
> 		evt->disposition = MCE_DISPOSITION_RECOVERED;

Yup, looks good to me.

Thanks,
-Mahesh.
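
For anyone following the thread, here is a minimal user-space sketch of the check agreed on above. It is not the kernel patch. The helper name mce_disposition_after_notify() and the MODEL_* enum are hypothetical, and the two constants are stand-ins defined locally for illustration (the real definitions live in the kernel headers). It only models the decision: the event counts as recovered when a notifier stopped the chain and MSR_RI was already set in the saved MSR. The bit is only tested, never set.

```c
/*
 * Stand-in constants for illustration only. In the kernel these come
 * from the notifier and powerpc register headers; the values here are
 * assumptions made so the sketch compiles on its own.
 */
#define NOTIFY_STOP_MASK 0x8000   /* notifier asked to stop the chain */
#define MSR_RI           0x2UL    /* "recoverable interrupt" bit in the MSR */

/* Hypothetical model of evt->disposition; not the kernel enum. */
enum mce_disposition_model {
	MODEL_NOT_RECOVERED,
	MODEL_RECOVERED,
};

/*
 * Hypothetical helper: decide the disposition from the notifier chain's
 * return code and the saved MSR. MSR_RI is checked, not modified: if it
 * was clear when the MCE arrived, SRR0/SRR1 were live and have been
 * overwritten, so the interrupt stays unrecoverable whatever the
 * notifier reports.
 */
static enum mce_disposition_model
mce_disposition_after_notify(int rc, unsigned long msr)
{
	if ((rc & NOTIFY_STOP_MASK) && (msr & MSR_RI))
		return MODEL_RECOVERED;

	return MODEL_NOT_RECOVERED;
}
```

With this shape, a notifier that stops the chain while MSR_RI is clear still yields "not recovered", which is the case the original `regs->msr |= MSR_RI` would have hidden.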