On Thu, Sep 03, 2015 at 03:05:21PM +1000, David Gibson wrote:
[snip]
> Hm.. so why can't the hypervisor code do the retrying?

Aravinda replied to this earlier in the thread:

"Retrying cannot be done internally in h_report_mc_err hcall: only one
thread can succeed entering qemu upon parallel hcall and hence retrying
inside the hcall will not allow the ibm,nmi-interlock from first CPU to
succeed."

I assume that this means that the big QEMU lock is held while an hcall is
processed by QEMU, but I haven't checked the code myself. Actually, even
if the lock is normally held, I don't see why these particular hcalls
couldn't release the lock. I'll look into this.

> > > Also, it looks like the vector will need at least one scratch register
> > > (for the hcall number, if nothing else). Does PAPR specify what SPRGs
> > > the vector can clobber? Obviously it can't be anything the guest
> > > kernel uses.
> >
> > PAPR only says SPRGs 0 to 3 are for software use, but the kernel (see
> > arch/powerpc/include/asm/reg.h) defines SPRG2 as an exception scratch
> > register so it should be the right one to use here.
>
> Uh.. no. If 0..3 are for software (i.e. OS) use, then this needs to
> use a different one, since it's being used as a firmware resource
> here. Linux might treat SPRG2 as scratch, but another OS would be
> within its rights to use it for something persistent.
>
> Although, as paulus points out, sc 1 will clobber SRR0/1 anyway, and
> if we use a special illegal instruction, then you no longer need a
> scratch register.
>
> > > Btw, does anyone know what happens with the VPA (and dispatch trace
> > > log and so forth) on kexec() - it could be subject to the same stale
> > > address problem, and rewriting vectors won't save us there.
> >
> > I asked Michael Ellerman this one and he thinks kexec probably frees and
> > re-allocates the VPA.
>
> Ok. So the question is: if an explicit deregister is good enough for
> the VPA, is it also good enough for the FWNMI vector, in which case
> doing it with just a qemu exit and not bouncing through the guest space
> is back on the table.
>
> I guess that's still problematic because there are existing guests
> that assume a kexec() will magically wipe the fwnmi vectors away.

Yes, but I think we could handle this separately if necessary: even if we
don't need to write anything to the vector, we could still insert a magic
value and check for it later. If it's been clobbered by a kexec, go back to
the old method.

Sam.
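
To make the magic-value idea above concrete, here is a rough sketch in C.
Everything in it is hypothetical: the struct, the function names and the
constant are invented for illustration and are not taken from QEMU or PAPR,
and guest memory is modelled as a plain byte array. It only shows the shape
of the check: stamp the vector at registration time, then test the stamp
before relying on the vector.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical marker; any value unlikely to appear by accident would do. */
#define FWNMI_MAGIC 0x46574e4dU /* "FWNM" in ASCII */

/* Hypothetical stand-in for guest RAM plus the registered vector offset. */
struct guest {
    uint8_t *ram;
    size_t ram_size;
    size_t fwnmi_vector;    /* offset the guest registered */
    bool fwnmi_registered;
};

/* At registration time: write the marker into the vector location. */
static bool fwnmi_stamp(struct guest *g, size_t vector)
{
    uint32_t magic = FWNMI_MAGIC;

    /* Reject offsets where the marker would not fit inside guest RAM. */
    if (vector > g->ram_size || g->ram_size - vector < sizeof(magic)) {
        return false;
    }
    memcpy(g->ram + vector, &magic, sizeof(magic));
    g->fwnmi_vector = vector;
    g->fwnmi_registered = true;
    return true;
}

/* Before delivering a machine check: is the marker still intact? */
static bool fwnmi_still_valid(const struct guest *g)
{
    uint32_t seen;

    if (!g->fwnmi_registered) {
        return false;
    }
    memcpy(&seen, g->ram + g->fwnmi_vector, sizeof(seen));
    return seen == FWNMI_MAGIC;
}
```

If fwnmi_still_valid() returns false (for example because a kexec'd kernel
overwrote that memory), the caller would fall back to the old delivery
method instead of using the vector.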