Hello,

On Fri, Sep 16, 2022 at 04:56:18PM -0500, Nathan Lynch wrote:
> "Nicholas Piggin" <npig...@gmail.com> writes:
> > On Wed Sep 14, 2022 at 3:39 AM AEST, Leonardo Brás wrote:
> >> On Mon, 2022-09-12 at 14:58 -0500, Nathan Lynch wrote:
> >> > Leonardo Brás <leobra...@gmail.com> writes:
> >> > > On Fri, 2022-09-09 at 09:04 -0500, Nathan Lynch wrote:
> >> > > > No, it means the premise of commit b664db8e3f97 ("powerpc/rtas:
> >> > > > Implement reentrant rtas call") change is incorrect. The "reentrant"
> >> > > > property described in the spec applies only to the individual RTAS
> >> > > > functions. The OS can invoke (for example) ibm,set-xive on multiple
> >> > > > CPUs simultaneously, but it must adhere to the more general
> >> > > > requirement to serialize with other RTAS functions.
> >> > > >
> >> > >
> >> > > I see. Thanks for explaining that part!
> >> > > I agree: reentrant calls that way don't look as useful on Linux than I
> >> > > previously thought.
> >> > >
> >> > > OTOH, I think that instead of reverting the change, we could make use
> >> > > of the correct information and fix the current implementation. (This
> >> > > could help when we do the same rtas call in multiple cpus)
> >> >
> >> > Hmm I'm happy to be mistaken here, but I doubt we ever really need to do
> >> > that. I'm not seeing the need.
> >> >
> >> > > I have an idea of a patch to fix this.
> >> > > Do you think it would be ok if I sent that, to prospect being an
> >> > > alternative to this reversion?
> >> >
> >> > It is my preference, and I believe it is more common, to revert to the
> >> > well-understood prior state, imperfect as it may be. The revert can be
> >> > backported to -stable and distros while development and review of
> >> > another approach proceeds.
> >>
> >> Ok then, as long as you are aware of the kdump bug, I'm good.
> >>
> >> FWIW:
> >> Reviewed-by: Leonardo Bras <leobra...@gmail.com>
> >
> > A shame. I guess a reader/writer lock would not be much help because
> > the crash is probably more likely to hit longer running rtas calls?
> >
> > Alternative is just cheat and do this...?
> >
> > Thanks,
> > Nick
> >
> > diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
> > index 693133972294..89728714a06e 100644
> > --- a/arch/powerpc/kernel/rtas.c
> > +++ b/arch/powerpc/kernel/rtas.c
> > @@ -26,6 +26,7 @@
> >  #include <linux/syscalls.h>
> >  #include <linux/of.h>
> >  #include <linux/of_fdt.h>
> > +#include <linux/panic.h>
> >
> >  #include <asm/interrupt.h>
> >  #include <asm/rtas.h>
> > @@ -97,6 +98,19 @@ static unsigned long lock_rtas(void)
> >  {
> >          unsigned long flags;
> >
> > +        if (atomic_read(&panic_cpu) == raw_smp_processor_id()) {
> > +                /*
> > +                 * Crash in progress on this CPU. Other CPUs should be
> > +                 * stopped by now, so skip the lock in case it was being
> > +                 * held, and is now needed for crashing e.g., kexec
> > +                 * (machine_kexec_mask_interrupts) requires rtas calls.
> > +                 *
> > +                 * It's possible this could have caused rtas state breakage
> > +                 * but the alternative is deadlock.
> > +                 */
> > +                return 0;
> > +        }
> > +
> >          local_irq_save(flags);
> >          preempt_disable();
> >          arch_spin_lock(&rtas.lock);
> > @@ -105,6 +119,9 @@ static unsigned long lock_rtas(void)
> >
> >  static void unlock_rtas(unsigned long flags)
> >  {
> > +        if (atomic_read(&panic_cpu) == raw_smp_processor_id())
> > +                return;
> > +
> >          arch_spin_unlock(&rtas.lock);
> >          local_irq_restore(flags);
> >          preempt_enable();
>
> Looks correct.
>
> I wonder - would it be worth making the panic path use a separate
> "emergency" rtas_args buffer as well? If a CPU is actually "stuck" in
> RTAS at panic time, then leaving rtas.args untouched might make the
> ibm,int-off, ibm,set-xive, ibm,os-term, and any other RTAS calls we
> incur on the panic path more likely to succeed.

Was some fix for the case of crashing in rtas merged?

Looks like there is none unless I missed something.

The parameter area allocator might help with the latter but the former
does not seem addressed.

Thanks

Michal
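
For reference, here is a rough userspace model of the two ideas quoted above: skipping the lock on the panicking CPU, and handing that CPU a separate "emergency" argument buffer. It is only a sketch and not kernel code. Every name in it (model_lock_rtas, model_unlock_rtas, model_args_for, struct rtas_args_model, MODEL_PANIC_CPU_INVALID) is made up for illustration, the CPU id is passed in explicitly instead of coming from raw_smp_processor_id(), and a C11 atomic_bool stands in for rtas.lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's "no CPU is panicking" value. */
#define MODEL_PANIC_CPU_INVALID (-1)

/* Userspace stand-ins for panic_cpu and rtas.lock (assumptions, not kernel objects). */
static atomic_int panic_cpu = MODEL_PANIC_CPU_INVALID;
static atomic_bool rtas_locked; /* static storage: starts out false */

/* Simplified argument buffer; the field layout is illustrative only. */
struct rtas_args_model {
    int token;
    int nargs;
    int args[16];
};

static struct rtas_args_model normal_args;    /* models rtas.args */
static struct rtas_args_model emergency_args; /* models the suggested panic-path buffer */

static bool crashing_on(int cpu)
{
    return atomic_load(&panic_cpu) == cpu;
}

/* Returns true if the lock was really taken, false if it was skipped. */
static bool model_lock_rtas(int cpu)
{
    /* Panicking CPU: the lock may be held by a stopped CPU, so do not wait for it. */
    if (crashing_on(cpu))
        return false;

    /* Simple spin lock: loop until we flip the flag from false to true. */
    while (atomic_exchange_explicit(&rtas_locked, true, memory_order_acquire))
        ;
    return true;
}

static void model_unlock_rtas(int cpu)
{
    /* Mirror the lock path: the panicking CPU never took the lock. */
    if (crashing_on(cpu))
        return;

    assert(atomic_load(&rtas_locked)); /* unlocking an unheld lock is a bug */
    atomic_store_explicit(&rtas_locked, false, memory_order_release);
}

/* The panicking CPU gets its own buffer, leaving normal_args untouched. */
static struct rtas_args_model *model_args_for(int cpu)
{
    return crashing_on(cpu) ? &emergency_args : &normal_args;
}
```

In this model, if CPU 1 holds the lock and CPU 0 then becomes the panic CPU, model_lock_rtas(0) returns false right away instead of spinning, and model_args_for(0) points at emergency_args, so whatever CPU 1 left in normal_args is not overwritten. Whether firmware tolerates a second call while another is in flight is exactly the open question in the thread; the model says nothing about that.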