On 2/29/24 20:16, Kinney, Michael D wrote:

-----Original Message-----
From: Paolo Bonzini <pbonz...@redhat.com>
Sent: Thursday, February 29, 2024 11:04 AM
To: Ni, Ray <ray...@intel.com>; devel@edk2.groups.io
Cc: Kinney, Michael D <michael.d.kin...@intel.com>; Liming Gao
<gaolim...@byosoft.com.cn>; Laszlo Ersek <ler...@redhat.com>; Michael
Brown <mc...@ipxe.org>
Subject: Re: [PATCH 2/2] MdeModulePkg/DxeCore: Fix stack overflow issue
due to nested interrupts

On 2/29/24 14:02, Ray Ni wrote:
In the end, it will lower the TPL to TPL_APPLICATION with interrupt
enabled.
However, it's possible that another timer interrupt happens just in
the end
of RestoreTPL() function when TPL is TPL_APPLICATION.
How do non-OVMF platforms solve the issue?  Do they just have the same
bug as in https://bugzilla.tianocore.org/show_bug.cgi?id=4162 ?
Yes.  This same issue can be reproduced on non-OVMF platforms.

This proposal here is an attempt to integrate a common fix into the DXE Core.

I would agree conceptually that integrating the NestedInterruptTplLib work
into the DXE Core is another option.

I believe the root cause of all of these scenarios is enabling interrupts
in RestoreTPL() when processing a timer interrupt between the last processed
event and the return from the interrupt handler. Ther are some instances
of the Timer Arch Protocol implementation that call Raise/Restore TPL, so
we want a DXE Core change that is compatible with the DXE Core doing 
Raise/Restore
when processing a timer interrupt and the Timer Arch Protocol implementation
also doing the Raise/Restore TPL.
Ok, now I understand better.

The reason why the NestedInterruptTplLib was introduced (as opposed to doing it in core DXE) was to enable returning with disabled interrupts from the nested interrupt handler, but I think it can be done with a function like the CoreRestoreTplInternal() I outlined in the previous email, which is the same as current CoreRestoreTpl() but finishes with
  if (!DesiredInterruptState) {
    CoreSetInterruptState (FALSE);
  }
  gEfiCurrentTpl = NewTpl;
  if (DesiredInterruptState) {
    ASSERT (gEfiCurrentTpl < TPL_HIGH_LEVEL);
    CoreSetInterruptState (TRUE);
  }

The new CoreRaiseTpl would be the same as in Ray and your patch, while the CoreRestoreTpl would be something like this:
    if (NewTpl == HighBitSet64 (mInterruptedTplMask)) {
      static NESTED_INTERRUPT_STATE NestedInterruptState;
      mInterruptedTplMask &= ~(UINTN)(1 << NewTpl);
      //
      // Use the deferred invocation logic that is currently
      // in NestedInterruptTplLib.
      //
      // But unlike current NestedInterruptRestoreTPL(), if the logic
      // is part of core DXE, the
      //
      //    gBS->RestoreTPL (InterruptedTPL);
      //    DisableInterrupts ();
      //
      // pair that requires "disable interrupts on IRET" logic can
      // be done without ever enabling interrupts,  with
      // CoreRestoreTplInternal(InterruptedTPL, FALSE)
      //
      // As an aside, NestedInterruptState might as well become a
      // pair of globals.
      //
      NestedInterruptRestoreTPL (NewTpl, &NestedInterruptState);
    } else {
      CoreRestoreTplInternal(NewTpl, NewTpl < TPL_HIGH_LEVEL);
    }

Requiring matching raise/restore pairs is a bit scary. It can be avoided by changing the "if" to a
  while (NewTpl >= HighBitSet64 (mInterruptedTplMask))
    mInterruptedTplMask &=
      ~(UINTN)(1 << HighBitSet64 (mInterruptedTplMask));

Then, if inlining NestedInterruptRestoreTPL() allows simplifications, they can be done on top after the merge of NestedInterruptTplLib. In particular, I suspect that the while loop above can be unified with the loop in NestedInterruptRestoreTPL(). But again, that would be best reviewed as a separate change.
All this, as Michael said, is however conditional on being able to deal 
with the TPL_HIGH_LEVEL+STI shenanigans that Windows does.
Paolo

The design of NestedInterruptTplLib is that each nested interrupt must
increase the TPL, but if I understand correctly there is a hole here:

    //
    // Call RestoreTPL() to allow event notifications to be
    // dispatched.  This will implicitly re-enable interrupts.
    //
    gBS->RestoreTPL (InterruptedTPL);

    //
    // Re-disable interrupts after the call to RestoreTPL() to ensure
    // that we have exclusive access to the shared state.
    //
    DisableInterrupts ();

because gBS->RestoreTPL will unconditionally enable interrupts if
InterruptedTPL < TPL_HIGH_LEVEL.


If possible, the easiest solution would be to merge
NestedInterruptTplLib into Core DXE.  This way, instead of calling
gBS->RestoreTPL, NestedInterruptTplLib can call a custom version of
CoreRestoreTpl that exits with interrupts disabled.  That is, something
like

VOID EFIAPI CoreRestoreTplInternal(IN EFI_TPL NewTpl,
                                     IN BOOLEAN InterruptState)
{
    //
    // The caller can request disabled interrupts to access shared
    // state, but TPL_HIGH_LEVEL must *not* have them enabled.
    //
    ASSERT(!(NewTpl == TPL_HIGH_LEVEL && InterruptState));

    // ...

    gEfiCurrentTpl = NewTpl;
    CoreSetInterruptState (InterruptState);
}

Now, CoreRestoreTpl is just

    //
    // If lowering below HIGH_LEVEL, make sure
    // interrupts are enabled
    //
    CoreRestoreTplInternal(NewTpl, NewTpl < TPL_HIGH_LEVEL);

whereas NestedInterruptRestoreTPL can do

    //
    // Call RestoreTPL() to allow event notifications to be
    // dispatched.  This will implicitly re-enable interrupts,
    // but only if events have to be dispatched.
    //
    CoreRestoreTplInternal(InterruptedTPL, FALSE);

    //
    // Interrupts are now disabled, so we can access shared state.
    //

This avoids the unlimited nesting of interrupts because each stack
frame
will indeed have a higher TPL than the outer version.

Paolo


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#116189): https://edk2.groups.io/g/devel/message/116189
Mute This Topic: https://groups.io/mt/104642317/21656
Group Owner: devel+ow...@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Reply via email to