On 17/01/2024 07:11, Ni, Ray wrote:
> The above flow shows endless re-entrance of the timer interrupt handler.
>
> But, my question is: the above flow can only happen on a real platform
> when the below 4 steps occupy more time than the timer period (usually
> 10ms).
>
>   [Timer Interrupt #2] 1. RaiseTPL (HIGH) from NOTIFY, causing CPU
>      interrupts to be disabled.
>   [Timer Interrupt #2] 2. Send APIC EOI (ACK the interrupt received
>      so the APIC can continue to generate interrupts)
>   [Timer Interrupt #2] 3. Call DxeCore::CoreTimerTick()
>   [Timer Interrupt #2] 4. RestoreTPL (NOTIFY) from HIGH. No callback
>      runs as no callback can be registered at TPL > NOTIFY. At the end
>      of RestoreTPL(), CPU interrupts are enabled.
>
> But, in my opinion, it's impossible.

As is thoroughly documented in NestedInterruptRestoreTpl(), the
potential for unbounded stack consumption arises when an interrupt
occurs after the point that RestoreTPL() completes dispatching all
notifications but before the IRET (or equivalent) instruction pops the
original stack frame.

Since dispatching notifications can take an unbounded amount of time,
there is absolutely no guarantee that this will be less than 10ms after
the previous interrupt. It could easily be 30 seconds later.

The problematic flow is a subtle variation on what you described:

  [IRQ#1]   timer interrupt at TPL_APPLICATION
    [ISR#1]   RaiseTPL from TPL_APPLICATION -> TPL_HIGH_LEVEL
    [ISR#1]   Send APIC EOI
    [ISR#1]   Call CoreTimerTick()
    [ISR#1]   RestoreTPL from TPL_HIGH_LEVEL -> TPL_APPLICATION
    [ISR#1]     Callbacks for TPL_NOTIFY are run
    [ISR#1]     Callbacks for TPL_CALLBACK are run
                ... these may take several *seconds* to complete, during
                which further interrupts are raised, the details of
                which are not shown here...
    [ISR#1]     TPL is now restored to TPL_APPLICATION
    [IRQ#N]   timer interrupt at TPL_APPLICATION
      [ISR#N]   RaiseTPL from TPL_APPLICATION -> TPL_HIGH_LEVEL
                ... continues ...

The root cause is that the ISR reaches a state in which:

  a) an arbitrary amount of time has elapsed since the triggering
     interrupt (due to unknown callbacks being invoked, which may
     themselves wait for further timer interrupts), and

  b) the TPL has been fully restored back to the TPL at the point
     the triggering interrupt occurred (i.e. TPL_APPLICATION in
     this example), and

  c) the timer interrupt source is enabled, and

  d) CPU interrupts are enabled

At this point, there is nothing preventing another interrupt from
occurring. It will occur at TPL_APPLICATION and it will be one stack
frame deeper than the previous interrupt at TPL_APPLICATION.

Rinse and repeat, and you have unbounded stack consumption.

Hence the requirement for NestedInterruptTplLib, even on physical hardware.
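
To make the stack growth concrete, here is a toy model in plain C. It is
only a sketch: the names (Model, timer_isr, simulate, close_window) are
hypothetical and none of this is EDK2 code. It assumes that a timer tick
is always pending by the time RestoreTPL() returns, which is what
conditions (a) to (d) above permit, and it does not model the nested
interrupts taken while callbacks are running. The close_window mode
assumes one possible way of closing the window (keeping CPU interrupts
disabled from the final TPL restoration until IRET); it is not meant as
a description of how NestedInterruptTplLib is implemented.

```c
#include <assert.h>

/* Toy model only: these names mimic, but are NOT, the EDK2 definitions. */
enum { TPL_APPLICATION = 4, TPL_HIGH_LEVEL = 31 };

typedef struct {
    int tpl;            /* current (simulated) task priority level        */
    int pending_ticks;  /* timer interrupts still waiting to be delivered */
    int depth;          /* ISR frames currently on the simulated stack    */
    int max_depth;      /* deepest ISR nesting observed so far            */
    int close_window;   /* 0: naive ISR, 1: interrupts stay off till IRET */
} Model;

static void timer_isr(Model *m);

/* Deliver one pending tick if interrupts are on and TPL permits it. */
static void maybe_interrupt(Model *m, int interrupts_enabled)
{
    if (interrupts_enabled && m->tpl < TPL_HIGH_LEVEL && m->pending_ticks > 0) {
        m->pending_ticks--;
        timer_isr(m);
    }
}

static void timer_isr(Model *m)
{
    int interrupted_tpl = m->tpl;

    m->depth++;                       /* interrupt pushes a new stack frame */
    if (m->depth > m->max_depth) {
        m->max_depth = m->depth;
    }

    m->tpl = TPL_HIGH_LEVEL;          /* RaiseTPL: interrupts disabled      */
    /* ... send EOI, call the timer tick handler ...                        */
    m->tpl = interrupted_tpl;         /* RestoreTPL: callbacks run here and */
                                      /* may take an arbitrary time         */

    /* Window between RestoreTPL() returning and IRET. In the naive case
     * interrupts are enabled here, so the next tick nests on this frame. */
    maybe_interrupt(m, !m->close_window);

    m->depth--;                       /* IRET pops the frame                */
}

/* Run `ticks` timer interrupts and report the deepest ISR nesting seen. */
static int simulate(int ticks, int close_window)
{
    Model m = { TPL_APPLICATION, ticks, 0, 0, close_window };

    while (m.pending_ticks > 0) {
        maybe_interrupt(&m, 1);       /* application runs with interrupts on */
    }
    assert(m.depth == 0);             /* every frame was eventually popped   */
    return m.max_depth;
}
```

In this model, simulate(1000, 0) returns 1000 (one extra frame per
interrupt, with no upper bound), whereas simulate(1000, 1) returns 1.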
Michael