On 18/08/2025 11:02 am, Jan Beulich wrote:
> On 09.08.2025 01:49, Andrew Cooper wrote:
>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> @@ -4209,8 +4209,18 @@ void asmlinkage vmx_vmexit_handler(struct 
>> cpu_user_regs *regs)
>>               ((intr_info & INTR_INFO_INTR_TYPE_MASK) ==
>>                MASK_INSR(X86_ET_NMI, INTR_INFO_INTR_TYPE_MASK)) )
>>          {
>> -            do_nmi(regs);
>> -            enable_nmis();
>> +            /*
>> +             * If we exited because of an NMI, NMIs are blocked in hardware,
>> +             * but software is expected to invoke the handler.
>> +             *
>> +             * Use INT $2.  Combined with the current state, it is the 
>> correct
>> +             * architectural state for the NMI handler,
> Not quite, I would say: For profiling (and anything else which may want to
> look at the outer context's register state from within the handler) we'd
> always appear to have been in Xen when the NMI "occurred".

We are always inside Xen when the NMI "occurred".

In fact there's a latent bug I didn't spot before.  Nothing appears to,
but if anything in do_nmi() were to to look at regs->entry_vector, it
will see stack rubble (release build) or poison (debug build).

Having gone searching, it's only the watchdog and oprofile which
configure perf counters with NMIs.  vPMU uses fixed interrupts, which
further calls into question it's utility.

>
>> and the IRET on the
>> +             * way back out will unblock NMIs.
>> +             *
>> +             * In FRED mode, we can spot this trick and cause the ERETS to
>> +             * unblock NMIs too.
>> +             */
>> +            asm ("int $2");
>>          }
>>          break;
>>      case EXIT_REASON_MCE_DURING_VMENTRY:
>> --- a/xen/arch/x86/traps.c
>> +++ b/xen/arch/x86/traps.c
>> @@ -2285,8 +2285,22 @@ void asmlinkage entry_from_xen(struct cpu_user_regs 
>> *regs)
>>          do_nmi(regs);
>>          break;
>>  
>> -    case X86_ET_HW_EXC:
>>      case X86_ET_SW_INT:
>> +        if ( regs->fred_ss.vector == 2 )
>> +        {
>> +            /*
>> +             * Explicit request from the the VMExit handler.  Rewrite the 
>> FRED
>> +             * frame to look like it was a real NMI, and go around again.
>> +             */
>> +            regs->fred_ss.swint = false;
>> +            regs->fred_ss.nmi = true;
>> +            regs->fred_ss.type = X86_ET_NMI;
>> +            regs->fred_ss.insnlen = 0;
>> +
>> +            return entry_from_xen(regs);
> Any particular reason to use recursion here (which the compiler may or may
> not transform)? In fact I'm having trouble seeing why you couldn't invoke
> do_nmi() here directly.

The first way I had entry_from_xen(), this was necessary to get the
right behaviour.  GCC did manage to transform it into a call to do_nmi().

But this has changed somewhat now so I think I can do it with a fallthrough.

~Andrew

Reply via email to