On 8/29/23 15:29, Ard Biesheuvel wrote:
> Laszlo reports that the efi_gdb.py script fails to produce a full
> backtrace when attaching it to an ARM firmware build that has halted on
> an unhandled exception.
>
> The reason is that the asm code that processes the exception was not
> implemented with this in mind, and therefore lacks any handling of it.
>
> So let's add this: create a dummy frame record suitable for chasing the
> frame pointer, and add the CFI metadata to describe where the return
> value can be found on the stack.
>
> When using a GCC5 build, this produces a stack trace such as
>
> (gdb) bt
> #0 0x000000007fd4537c in CpuDeadLoop ()
>    at /home/ardb/build/edk2/MdePkg/Library/BaseLib/CpuDeadLoop.c:30
> #1 0x000000007fd454f8 in DebugAssert (
>    FileName=FileName@entry=0x7fd4a8a8 <MmioWrite64Internal+3604>
>    "/home/ardb/build/edk2/ArmPkg/Library/DefaultExceptionHandlerLib/AArch64/DefaultExceptionHandler.c",
>    LineNumber=LineNumber@entry=343,
>    Description=Description@entry=0x7fd4a896 <MmioWrite64Internal+3586>
>    "((BOOLEAN)(0==1))")
>    at /home/ardb/build/edk2/MdePkg/Library/BaseDebugLibSerialPort/DebugLib.c:235
> #2 0x000000007fd479ec in DefaultExceptionHandler (ExceptionType=<optimized out>, SystemContext=...)
>    at /home/ardb/build/edk2/ArmPkg/Library/DefaultExceptionHandlerLib/AArch64/DefaultExceptionHandler.c:343
> #3 0x000000007fd48eb8 in ExceptionHandlersEnd ()
> #4 0x000000007fcde944 in QemuLoadKernelImage (ImageHandle=<synthetic pointer>)
>    at /home/ardb/build/edk2/OvmfPkg/Library/GenericQemuLoadImageLib/GenericQemuLoadImageLib.c:201
> #5 TryRunningQemuKernel ()
>    at /home/ardb/build/edk2/ArmVirtPkg/Library/PlatformBootManagerLib/QemuKernel.c:46
> #6 PlatformBootManagerAfterConsole ()
>    at /home/ardb/build/edk2/ArmVirtPkg/Library/PlatformBootManagerLib/PlatformBm.c:1139
> #7 BdsEntry (This=<optimized out>)
>    at /home/ardb/build/edk2/MdeModulePkg/Universal/BdsDxe/BdsEntry.c:931
> #8 0x000000007ffd0018 in ?? ()
> Backtrace stopped: previous frame inner to this frame (corrupt stack?)
>
> when QemuLoadKernelImage() has been tweaked to trigger an exception, as
> is shown by GDB when walking the call stack:
>
> |   0x7fcde938 <BdsEntry+3292> b.ne 0x7fcdf134 <BdsEntry+5336> // b.any
> |   0x7fcde93c <BdsEntry+3296> mov x0, #0x40 // #64
> |   0x7fcde940 <BdsEntry+3300> bl 0x7fcd7aec <DebugPrint>
> | > 0x7fcde944 <BdsEntry+3304> brk #0x4d2
> |   0x7fcde948 <BdsEntry+3308> bl 0x7fce4354 <ConnectDevicesFromQemu>
> |   0x7fcde94c <BdsEntry+3312> tbz x0, #63, 0x7fcde954 <BdsEntry+3320>
> |   0x7fcde950 <BdsEntry+3316> bl 0x7fcd844c <EfiBootManagerConnectAll>
> |   0x7fcde954 <BdsEntry+3320> bl 0x7fcd990c <EfiBootManagerRefreshAllBootOption
>
> Unfortunately, CLANGDWARF does not seem entirely happy with this
> arrangement: it identifies the call frame where the exception
> originated, but does not show any frames above that. (This could be
> related to the fact that the exception code uses a separate exception
> stack for handling synchronous exceptions)

First of all, thanks for writing this patch so incredibly quickly. :)

Second, something must be off with my gdb. Before your patch, I kept
experimenting with manually resetting FP, SP, and LR to the values
printed in the register dump, using gdb "set" commands. Strangely, that
did result in complete pre-exception stack traces, but *only sometimes*.
Most of the time gdb complains about "corrupted stack". And I just can't
figure out what distinguishes the broken from the functional "bt"
commands -- I did walk the allegedly corrupt stack manually, and there
is nothing corrupt in the FP and LR parts of the stack frames. They all
chain nicely and point to valid instructions, respectively. I don't know
what it is that gdb doesn't like.

Third, when I test your patch, I seem to experience precisely what you
describe under CLANGDWARF -- it shows the faulting frame (the frame just
before the exception), but nothing before it! And I'm not building with
clang :(

Thanks,
Laszlo

>
> Signed-off-by: Ard Biesheuvel <a...@kernel.org>
> ---
>  ArmPkg/Library/ArmExceptionLib/AArch64/ExceptionSupport.S | 18 +++++++++++++++++-
>  1 file changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/ArmPkg/Library/ArmExceptionLib/AArch64/ExceptionSupport.S b/ArmPkg/Library/ArmExceptionLib/AArch64/ExceptionSupport.S
> index cd9437b6aab8..345b566932bb 100644
> --- a/ArmPkg/Library/ArmExceptionLib/AArch64/ExceptionSupport.S
> +++ b/ArmPkg/Library/ArmExceptionLib/AArch64/ExceptionSupport.S
> @@ -259,6 +259,8 @@ ASM_PFX(ExceptionHandlersEnd):
>
>
>  ASM_PFX(CommonExceptionEntry):
> +  .cfi_sections .debug_frame
> +  .cfi_startproc
>
>    EL1_OR_EL2_OR_EL3(x1)
>  1:mrs x2, elr_el1 // Exception Link Register
> @@ -280,6 +282,13 @@ ASM_PFX(CommonExceptionEntry):
>
>  4:mrs x4, fpsr // Floating point Status Register 32bit
>
> +  // Create a dummy frame record using the ELR as the return address
> +  stp x29, x2, [sp, #-16]!
> +  .cfi_def_cfa_offset (GP_CONTEXT_SIZE + FP_CONTEXT_SIZE + SYS_CONTEXT_SIZE + 16)
> +  .cfi_rel_offset x29, 0
> +  .cfi_rel_offset x30, 8
> +  mov x29, sp
> +
>    // Save the SYS regs
>    stp x2, x3, [x28, #-SYS_CONTEXT_SIZE]!
>    stp x4, x5, [x28, #0x10]
> @@ -305,7 +314,7 @@ ASM_PFX(CommonExceptionEntry):
>
>    // x0 still holds the exception type.
>    // Set x1 to point to the top of our struct on the Stack
> -  mov x1, sp
> +  add x1, sp, #16
>
>    // CommonCExceptionHandler (
>    // IN EFI_EXCEPTION_TYPE ExceptionType, R0
> @@ -318,6 +327,9 @@ ASM_PFX(CommonExceptionEntry):
>    // We do not try to recover.
>    bl ASM_PFX(CommonCExceptionHandler) // Call exception handler
>
> +  // Pop dummy frame record
> +  add sp, sp, #16
> +
>    // Pop as many GP regs as we can before entering the critical section below
>    ldp x2, x3, [sp, #0x10]
>    ldp x4, x5, [sp, #0x20]
> @@ -378,13 +390,17 @@ ASM_PFX(CommonExceptionEntry):
>
>    // pop remaining GP regs and return from exception.
>    ldr x30, [sp, #0xf0 - 0xe0]
> +  .cfi_restore 30
>    ldp x28, x29, [sp], #GP_CONTEXT_SIZE - 0xe0
> +  .cfi_restore 29
>
>    // Adjust SP to be where we started from when we came into the handler.
>    // The handler can not change the SP.
>    add sp, sp, #FP_CONTEXT_SIZE + SYS_CONTEXT_SIZE
> +  .cfi_def_cfa_offset 0
>
>    eret
> +  .cfi_endproc
>
>  ASM_FUNC(RegisterEl0Stack)
>    msr sp_el0, x0

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#108096): https://edk2.groups.io/g/devel/message/108096
-=-=-=-=-=-=-=-=-=-=-=-
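
For reference, the manual walk of the FP/LR chain mentioned above can be sketched in Python. This is only an illustration, not what gdb or efi_gdb.py actually does: it assumes the usual AArch64 frame record layout (the caller's FP at [FP], the return address at [FP + 8]), and the function name, the dict-based "memory" and the stack addresses are all hypothetical. The return addresses are borrowed from the backtrace quoted earlier in the thread.

```python
def walk_frame_records(memory, fp, pc, max_frames=64):
    """Follow a chain of AArch64 frame records.

    memory: dict mapping an address to the 64-bit value stored there
            (a stand-in for reading target memory; hypothetical).
    fp:     current frame pointer (x29).
    pc:     current program counter, reported as frame #0.
    Returns the list of code addresses, innermost frame first.
    """
    frames = [pc]
    seen = set()
    while fp != 0 and len(frames) < max_frames:
        if fp in seen:
            # A frame record that points back into the chain would loop forever.
            break
        seen.add(fp)
        try:
            prev_fp = memory[fp]       # saved x29 of the caller
            ret = memory[fp + 8]       # saved return address (x30, or ELR in the dummy record)
        except KeyError:
            # Unreadable memory: stop rather than guess.
            break
        frames.append(ret)
        fp = prev_fp
    return frames


# Hypothetical layout: a dummy record at 0x1000 whose return slot holds the
# ELR (the faulting brk), chained to one caller record at 0x2000.
memory = {
    0x1000: 0x2000, 0x1008: 0x7FCDE944,
    0x2000: 0x0,    0x2008: 0x7FFD0018,
}
trace = walk_frame_records(memory, fp=0x1000, pc=0x7FD48EB8)
print([hex(addr) for addr in trace])
```

A walk like this only follows the frame records; it does not apply any of the extra sanity checks a debugger may perform, so a chain that looks fine here can still be rejected by gdb.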