Balbir Singh <bsinghar...@gmail.com> writes:
> On MCE the current code will restart the machine with
> ppc_md.restart(). This case was extremely unlikely since
> prior to that a skiboot call is made and that resulted in
> a checkstop for analysis.
>
> With newer skiboots, on P9 we don't checkstop the box by
> default, instead we return back to the kernel to extract
> useful information at the time of the MCE. While we still
> get this information, this patch converts the restart to
> a panic(), so that if configured a dump can be taken and
> we can track and probably debug the potential issue causing
> the MCE.

I agree with the patch, although I'd be nervous stating that skiboot is
going to keep this behaviour.

In *theory* we should only ever get a platform error when there's
actually something that isn't the kernel's fault. Like any firmware
promise though, it's slightly less reliable than one from a politician.

I'd say that in this case deferring to policy on what to do in the
event of panic() is the right thing.

--
Stewart Smith
OPAL Architect, IBM.