On 5/5/23 16:10, Michael Brown wrote:
> On 03/05/2023 08:19, Gerd Hoffmann wrote:
>> OVMF can't guarantee that the ASSERT() doesn't happen.  Misbehaving
>> EFI applications can trigger this.  So log a warning instead and try
>> to continue.
>>
>> Reproducer: Fetch windows 11 22H2 iso image, boot it in qemu with OVMF.
>>
>> Traced to BootServices->Stall() being called with IPL=TPL_HIGH_LEVEL
>> and Interrupts /enabled/ while windows is booting.
>>
>> Cc: Michael Brown <mc...@ipxe.org>
>> Cc: Laszlo Ersek <ler...@redhat.com>
>> Signed-off-by: Gerd Hoffmann <kra...@redhat.com>
>> ---
>>   OvmfPkg/Library/NestedInterruptTplLib/Tpl.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/OvmfPkg/Library/NestedInterruptTplLib/Tpl.c b/OvmfPkg/Library/NestedInterruptTplLib/Tpl.c
>> index e19d98878eb7..fdd7d15c4ba8 100644
>> --- a/OvmfPkg/Library/NestedInterruptTplLib/Tpl.c
>> +++ b/OvmfPkg/Library/NestedInterruptTplLib/Tpl.c
>> @@ -39,7 +39,9 @@ NestedInterruptRaiseTPL (
>>     //
>>     ASSERT (GetInterruptState () == FALSE);
>>     InterruptedTPL = gBS->RaiseTPL (TPL_HIGH_LEVEL);
>> -  ASSERT (InterruptedTPL < TPL_HIGH_LEVEL);
>> +  if (InterruptedTPL >= TPL_HIGH_LEVEL) {
>> +    DEBUG ((DEBUG_WARN, "%a: Called at IPL %d\n", __func__, InterruptedTPL));
>> +  }
>>       return InterruptedTPL;
>>   }
> 
> While https://bugzilla.redhat.com/show_bug.cgi?id=2189136 continues to
> track the underlying Windows bug that leads to this assertion being
> triggered: I suspect that this patch will allow people to boot these
> buggy versions of Windows in OVMF, and I don't think it will make things
> any worse.
> 
> I would probably suggest changing DEBUG_WARN to DEBUG_ERROR since this
> represents a serious invariant violation being detected.  With that change:
> 
>   Reviewed-by: Michael Brown <mc...@ipxe.org>

I don't like the patch, for two reasons:

(1) It papers over the actual issue. The problem should be fixed where
it is, if possible.

(2) With the patch applied, NestedInterruptRaiseTPL() can return
TPL_HIGH_LEVEL (as "InterruptedTPL"). Consequently,
TimerInterruptHandler() [OvmfPkg/LocalApicTimerDxe/LocalApicTimerDxe.c]
may pass TPL_HIGH_LEVEL back to NestedInterruptRestoreTPL(), as
"InterruptedTPL".

I believe that this in turn may invalidate at least one comment in
NestedInterruptRestoreTPL():

    //
    // Call RestoreTPL() to allow event notifications to be
    // dispatched.  This will implicitly re-enable interrupts.
    //
    gBS->RestoreTPL (InterruptedTPL);

Restoring TPL_HIGH_LEVEL does not re-enable interrupts -- nominally, anyway.
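
To illustrate point (2), here is a minimal toy model in C. It is only a sketch: the ToyCpu type and the toy_* functions are hypothetical and are not the DXE core or NestedInterruptTplLib code. It assumes only that raising to TPL_HIGH_LEVEL masks interrupts and that restoring to anything below TPL_HIGH_LEVEL unmasks them.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only. The constants use the UEFI TPL values, but ToyCpu and
   the toy_* functions are hypothetical illustrations, not edk2 code. */
enum {
  TPL_APPLICATION = 4,
  TPL_CALLBACK    = 8,
  TPL_NOTIFY      = 16,
  TPL_HIGH_LEVEL  = 31
};

typedef struct {
  int  current_tpl;
  bool interrupts_enabled;
} ToyCpu;

/* Raise the TPL and return the previous one.
   Assumption: reaching TPL_HIGH_LEVEL masks interrupts. */
int toy_raise_tpl(ToyCpu *cpu, int new_tpl) {
  int old_tpl = cpu->current_tpl;
  assert(new_tpl >= old_tpl);
  cpu->current_tpl = new_tpl;
  if (new_tpl >= TPL_HIGH_LEVEL) {
    cpu->interrupts_enabled = false;
  }
  return old_tpl;
}

/* Restore a previously returned TPL.
   Assumption: interrupts are unmasked only when the restored TPL is
   below TPL_HIGH_LEVEL. */
void toy_restore_tpl(ToyCpu *cpu, int old_tpl) {
  assert(old_tpl <= cpu->current_tpl);
  cpu->current_tpl = old_tpl;
  if (old_tpl < TPL_HIGH_LEVEL) {
    cpu->interrupts_enabled = true;
  }
}

/* Simulate a handler that raises to TPL_HIGH_LEVEL on entry and restores
   the interrupted TPL on exit. Returns whether interrupts end up enabled. */
bool toy_handler_reenables_interrupts(int tpl_at_interrupt) {
  /* Interrupts are masked on entry to the handler. */
  ToyCpu cpu = { tpl_at_interrupt, false };
  int interrupted_tpl = toy_raise_tpl(&cpu, TPL_HIGH_LEVEL);
  toy_restore_tpl(&cpu, interrupted_tpl);
  return cpu.interrupts_enabled;
}
```

In this model, toy_handler_reenables_interrupts (TPL_APPLICATION) yields true, whereas toy_handler_reenables_interrupts (TPL_HIGH_LEVEL) yields false. The latter is the case in which the "implicitly re-enable interrupts" comment would no longer hold.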


I wouldn't like OVMF to stick with yet another workaround / yet more
internal inconsistency. We should just wait until fixed Windows
installer media gets released.

Here's an alternative:

(a) Make LocalApicTimerDxe Xen-specific again. It's only the OVMF Xen
platform that really *needs* NestedInterruptTplLib. (Don't get me wrong:
NestedInterruptTplLib is technically correct in all circumstances, but
in practice it happens to be too strict.)

(b) For the non-Xen OVMF platforms, re-create a LocalApicTimerDxe
variant that effectively has commits a086f4a63bc0 and a24fbd606125
reverted. (We should keep 9bf473da4c1d.) This returns us to
pre-239b50a86370 status -- that is, a timer interrupt handler that (i)
does not try to be smart about nested interrupts, and is therefore much
simpler, (ii) is more tolerant of the Windows / cdboot.efi spec
violation, and (iii) is vulnerable to the timer interrupt storm seen on
Xen, but will never run on Xen. (Only the OVMF Xen platform is supposed
to be launched on Xen.)

Laszlo


