Nicholas Piggin <npig...@gmail.com> writes: > The OPAL call wrapper gets interrupt disabling wrong. It disables > interrupts just by clearing MSR[EE], which has two problems: > > - It doesn't call into the IRQ tracing subsystem, which means tracing > across OPAL calls does not always notice IRQs have been disabled. > > - It doesn't go through the IRQ soft-mask code, which causes a minor > bug. MSR[EE] can not be restored by saving the MSR then clearing > MSR[EE], because a racing interrupt while soft-masked could clear > MSR[EE] between the two steps. This can cause MSR[EE] to be > incorrectly enabled when the OPAL call returns. Fortunately that > should only result in another masked interrupt being taken to > disable MSR[EE] again, but it's a bit sloppy. > > The existing code also saves MSR to PACA, which is not re-entrant if > there is a nested OPAL call from different MSR contexts, which can > happen these days with SRESET interrupts on bare metal. > > To fix these issues, move the tracing and IRQ handling code to C, and > call into asm just for the low level call when everything is ready to > go. Save the MSR on stack rather than PACA. > > Performance cost is kept to a minimum with a few optimisations: > > - The endian switch upon return is combined with the MSR restore, > which avoids an expensive context synchronizing operation for LE > kernels. This makes up for the additional mtmsrd to enable > interrupts with local_irq_enable(). > > - blr is now used to return from the opal_* functions that are called > as C functions, to avoid link stack corruption. This requires a > skiboot fix as well to keep the call stack balanced. > > A NULL call is more costly after this, (410ns->430ns on POWER9), but > OPAL calls are generally not performance critical at this scale.
At least with this patch we can start to better measure things, and that's a big plus. Also, When I get to start worrying about <1ms OPAL calls, I think we can revisit optimising this path :) -- Stewart Smith OPAL Architect, IBM.