On 5/5/23 16:10, Michael Brown wrote:
> On 03/05/2023 08:19, Gerd Hoffmann wrote:
>> OVMF can't guarantee that the ASSERT() doesn't happen.  Misbehaving
>> EFI applications can trigger this.  So log a warning instead and try
>> to continue.
>>
>> Reproducer: Fetch windows 11 22H2 iso image, boot it in qemu with OVMF.
>>
>> Traced to BootServices->Stall() being called with IPL=TPL_HIGH_LEVEL
>> and Interrupts /enabled/ while windows is booting.
>>
>> Cc: Michael Brown <mc...@ipxe.org>
>> Cc: Laszlo Ersek <ler...@redhat.com>
>> Signed-off-by: Gerd Hoffmann <kra...@redhat.com>
>> ---
>>  OvmfPkg/Library/NestedInterruptTplLib/Tpl.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/OvmfPkg/Library/NestedInterruptTplLib/Tpl.c b/OvmfPkg/Library/NestedInterruptTplLib/Tpl.c
>> index e19d98878eb7..fdd7d15c4ba8 100644
>> --- a/OvmfPkg/Library/NestedInterruptTplLib/Tpl.c
>> +++ b/OvmfPkg/Library/NestedInterruptTplLib/Tpl.c
>> @@ -39,7 +39,9 @@ NestedInterruptRaiseTPL (
>>    //
>>    ASSERT (GetInterruptState () == FALSE);
>>    InterruptedTPL = gBS->RaiseTPL (TPL_HIGH_LEVEL);
>> -  ASSERT (InterruptedTPL < TPL_HIGH_LEVEL);
>> +  if (InterruptedTPL >= TPL_HIGH_LEVEL) {
>> +    DEBUG ((DEBUG_WARN, "%a: Called at IPL %d\n", __func__, InterruptedTPL));
>> +  }
>>    return InterruptedTPL;
>>  }
>
> While https://bugzilla.redhat.com/show_bug.cgi?id=2189136 continues to
> track the underlying Windows bug that leads to this assertion being
> triggered: I suspect that this patch will allow people to boot these
> buggy versions of Windows in OVMF, and I don't think it will make things
> any worse.
>
> I would probably suggest changing DEBUG_WARN to DEBUG_ERROR since this
> represents a serious invariant violation being detected.  With that change:
>
> Reviewed-by: Michael Brown <mc...@ipxe.org>

I don't like the patch.  For two reasons:

(1) It papers over the actual issue.  The problem should be fixed where
it is, if possible.

(2) With the patch applied, NestedInterruptRaiseTPL() can return
TPL_HIGH_LEVEL (as "InterruptedTPL").  Consequently,
TimerInterruptHandler() [OvmfPkg/LocalApicTimerDxe/LocalApicTimerDxe.c]
may pass TPL_HIGH_LEVEL back to NestedInterruptRestoreTPL(), as
"InterruptedTPL".  I believe that this in turn may invalidate at least
one comment in NestedInterruptRestoreTPL():

    //
    // Call RestoreTPL() to allow event notifications to be
    // dispatched.  This will implicitly re-enable interrupts.
    //
    gBS->RestoreTPL (InterruptedTPL);

Restoring TPL_HIGH_LEVEL does not re-enable interrupts -- nominally
anyways.

I wouldn't like OVMF to stick with yet another workaround / yet more
internal inconsistency.  We should just wait until fixed Windows
installer media gets released.

Here's an alternative:

(a) Make LocalApicTimerDxe Xen-specific again.  It's only the OVMF Xen
platform that really *needs* NestedInterruptTplLib.  (Don't get me
wrong: NestedInterruptTplLib is technically correct in all
circumstances, but in practice it happens to be too strict.)

(b) For the non-Xen OVMF platforms, re-create a LocalApicTimerDxe
variant that effectively has commits a086f4a63bc0 and a24fbd606125
reverted.  (We should keep 9bf473da4c1d.)

This returns us to pre-239b50a86370 status -- that is, a timer interrupt
handler that

(a) does not try to be smart about nested interrupts, therefore one that
    is much simpler, and

(b) is more tolerant of the Windows / cdboot.efi spec violation,

(c) is vulnerable to the timer interrupt storm seen on Xen, but will
    never run on Xen.  (Only the OVMF Xen platform is supposed to be
    launched on Xen.)

Laszlo
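
As a footnote to point (2): the concern can be illustrated with a toy
model in C.  This is only a sketch, not EDK2 code.  The helper names
model_raise_tpl(), model_restore_tpl() and interrupts_on_after_handler()
are hypothetical, and the model assumes the nominal behaviour described
above: raising to TPL_HIGH_LEVEL disables interrupts, and only a restore
to a TPL below TPL_HIGH_LEVEL re-enables them.

```c
#include <assert.h>
#include <stdbool.h>

/* TPL values as defined by the UEFI specification. */
#define TPL_APPLICATION 4
#define TPL_CALLBACK    8
#define TPL_NOTIFY      16
#define TPL_HIGH_LEVEL  31

typedef unsigned int TPL;

/* Toy global state standing in for the DXE core and the CPU. */
TPL  current_tpl        = TPL_APPLICATION;
bool interrupts_enabled = true;

/* Hypothetical model of RaiseTPL(): returns the previous TPL.
 * Assumption: reaching TPL_HIGH_LEVEL disables interrupts. */
TPL model_raise_tpl(TPL new_tpl)
{
    TPL old_tpl = current_tpl;

    assert(new_tpl >= current_tpl);   /* a raise must never lower the TPL */
    if (new_tpl >= TPL_HIGH_LEVEL) {
        interrupts_enabled = false;
    }
    current_tpl = new_tpl;
    return old_tpl;
}

/* Hypothetical model of RestoreTPL().
 * Assumption: interrupts are re-enabled only when the restored TPL
 * is below TPL_HIGH_LEVEL. */
void model_restore_tpl(TPL old_tpl)
{
    assert(old_tpl <= current_tpl);   /* a restore must never raise the TPL */
    current_tpl = old_tpl;
    if (old_tpl < TPL_HIGH_LEVEL) {
        interrupts_enabled = true;
    }
}

/* Simulate an interrupt handler that brackets its work with a
 * raise/restore pair, entered while running at start_tpl.
 * Returns whether interrupts are enabled after the restore. */
bool interrupts_on_after_handler(TPL start_tpl)
{
    TPL interrupted_tpl;

    current_tpl        = start_tpl;
    interrupts_enabled = false;       /* handlers are entered with interrupts off */

    interrupted_tpl = model_raise_tpl(TPL_HIGH_LEVEL);
    model_restore_tpl(interrupted_tpl);
    return interrupts_enabled;
}
```

In this model, interrupts_on_after_handler(TPL_APPLICATION) yields true,
whereas interrupts_on_after_handler(TPL_HIGH_LEVEL) yields false: when
the interrupted TPL is already TPL_HIGH_LEVEL, the restore no longer
matches the "implicitly re-enable interrupts" comment.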