On 23/07/2019 16:53, Juergen Gross wrote:
> On 23.07.19 17:29, Andrew Cooper wrote:
>> On 23/07/2019 16:22, Juergen Gross wrote:
>>> On 23.07.19 17:04, Jan Beulich wrote:
>>>> On 23.07.2019 16:29, Juergen Gross wrote:
>>>>> On 23.07.19 16:14, Jan Beulich wrote:
>>>>>> On 23.07.2019 16:03, Jan Beulich wrote:
>>>>>>> On 23.07.2019 15:44, Juergen Gross wrote:
>>>>>>>> On 23.07.19 14:42, Jan Beulich wrote:
>>>>>>>>> v->processor gets latched into st->processor before raising the
>>>>>>>>> softirq, but can't the vCPU be moved elsewhere by the time the
>>>>>>>>> softirq handler actually gains control? If that's not possible
>>>>>>>>> (and if it's not obvious why, and as you can see it's not
>>>>>>>>> obvious to me), then I think a code comment wants to be added
>>>>>>>>> there.
>>>>>>>>
>>>>>>>> You are right, it might be possible for the vcpu to move around.
>>>>>>>>
>>>>>>>> OTOH is it really important to run the target vcpu exactly on
>>>>>>>> the cpu it is executing (or has last executed) at the time the
>>>>>>>> NMI/MCE is being queued? This is in no way related to the cpu
>>>>>>>> the MCE or NMI has been happening on. It is just a random cpu,
>>>>>>>> and so it would be if we'd do the cpu selection when the softirq
>>>>>>>> handler is running.
>>>>>>>>
>>>>>>>> One question to understand the idea behind all that: _why_ is
>>>>>>>> the vcpu pinned until it does an iret? I could understand if it
>>>>>>>> were pinned to the cpu where the NMI/MCE was happening, but this
>>>>>>>> is not the case.
>>>>>>>
>>>>>>> Then it was never finished or got broken, I would guess.
>>>>>>
>>>>>> Oh, no. The #MC side use has gone away in 3a91769d6e, without
>>>>>> cleaning up other code. So there doesn't seem to be any such
>>>>>> requirement anymore.
>>>>>
>>>>> So just to be sure: you are fine with me removing the pinning for
>>>>> NMIs?
>>>>
>>>> No, not the pinning as a whole. The forced CPU0 affinity should
>>>> still remain. It's just that there's no correlation anymore between
>>>> the CPU a vCPU was running on and the CPU it is to be pinned to
>>>> (temporarily).
>>>
>>> I don't get it. Today vcpu0 of the hardware domain is pinned to the
>>> cpu it was last running on when the NMI happened. Why is that
>>> important? Or do you want to change the logic and pin vcpu0 for NMI
>>> handling always to CPU0?
>>
>> It's (allegedly) for when dom0 knows some system-specific way of
>> getting extra information out of the platform that happens to be
>> core-specific.
>>
>> There are rare cases where SMIs need to be executed on CPU0, and I
>> wouldn't put it past hardware designers to have similar aspects for
>> NMIs.
>
> Understood. But today vcpu0 is _not_ bound to CPU0, but to whichever
> cpu it happened to run on.
>
>> That said, as soon as the gaping security hole which is the default
>> readability of all MSRs gets fixed, I bet the utility of this pinning
>> mechanism will be 0.
>
> And my reasoning is that this is the case today already, as there is
> no pinning to CPU0 done, at least not on purpose.
Based on this analysis, I'd be tempted to drop the pinning completely.
It clearly isn't working in a rational way.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel