On Tue, 18 Apr 2023, Michael Schmitz wrote: > Am 18.04.2023 um 14:04 schrieb Finn Thain: > > On Tue, 18 Apr 2023, Michael Schmitz wrote: > >> Am 16.04.2023 um 18:44 schrieb Finn Thain: > >> > >>> 0xeffff750: 0xc01a0000 saved $a5 == libc .got > >>> 0xeffff74c: 0xc0023e8c saved $a3 == > >>> &__stack_chk_guard > >>> 0xeffff748: 0x00000000 saved $a2 > >>> 0xeffff744: 0x00000001 saved $d5 > >>> 0xeffff740: 0xeffff86e saved $d4 > >>> 0xeffff73c: 0xeffff86a saved $d3 > >>> 0xeffff738: 0x00000002 saved $d2 > >>> 0xeffff734: 0x00000000 > >>> 0xeffff730: 0x00000000 > >>> 0xeffff72c: 0x00000000 > >>> 0xeffff728: 0x00000000 > >>> 0xeffff724: 0x00000000 > >>> 0xeffff720: 0x00000000 > >>> 0xeffff71c: 0x00000000 > >>> 0xeffff718: 0x00000000 > >>> 0xeffff714: 0x00000000 > >>> 0xeffff710: 0x00000000 > >>> 0xeffff70c: 0x00000000 > >>> 0xeffff708: 0x00000000 > >>> 0xeffff704: 0x00000000 > >>> 0xeffff700: 0x00000000 > >>> 0xeffff6fc: 0x00000000 > >>> 0xeffff6f8: 0x00000000 > >>> 0xeffff6f4: 0x00000000 > >>> 0xeffff6f0: 0x00000000 > >>> 0xeffff6ec: 0x00000000 > >>> 0xeffff6e8: 0x00000000 > >>> 0xeffff6e4: 0x00000000 > >>> 0xeffff6e0: 0x00000000 > >>> 0xeffff6dc: 0x00000000 > >>> 0xeffff6d8: 0x00000000 > >>> 0xeffff6d4: 0x00000000 > >>> 0xeffff6d0: 0x00000000 > >>> 0xeffff6cc: 0x00000000 > >>> 0xeffff6c8: 0x00000000 > >>> 0xeffff6c4: 0x00000000 > >>> 0xeffff6c0: 0x00000000 > >>> 0xeffff6bc: 0x00000000 > >>> 0xeffff6b8: 0x00000000 > >>> 0xeffff6b4: 0x00000000 > >>> 0xeffff6b0: 0x00000000 > >>> 0xeffff6ac: 0x00000000 > >>> 0xeffff6a8: 0x00000000 > >>> 0xeffff6a4: 0x00000000 > >>> 0xeffff6a0: 0x00000000 > >>> 0xeffff69c: 0x00000000 > >>> 0xeffff698: 0x00000000 > >>> 0xeffff694: 0x00000000 > >>> 0xeffff690: 0x00000000 > >>> 0xeffff68c: 0x00000000 > >>> 0xeffff688: 0x00000000 > >>> 0xeffff684: 0x00000000 > >>> 0xeffff680: 0x00000000 > >>> 0xeffff67c: 0x00000000 > >>> 0xeffff678: 0x00000000 > >>> 0xeffff674: 0x00000000 > >>> 0xeffff670: 0x00000000 > >>> 0xeffff66c: 0x00000000 > >>> 
0xeffff668: 0x00000000 > >>> 0xeffff664: 0x00000000 > >>> 0xeffff660: 0x41000000 > >>> 0xeffff65c: 0x00000000 > >>> 0xeffff658: 0x00000000 > >>> 0xeffff654: 0x00000000 > >>> 0xeffff650: 0x00000000 > >>> 0xeffff64c: 0x80000000 > >>> 0xeffff648: 0x3fff0000 > >>> 0xeffff644: 0x00000000 > >>> 0xeffff640: 0xd0000000 > >>> 0xeffff63c: 0x40020000 <= (sc.formatvec & 0xffff) << > >>> 16; fpregs from here on > >>> 0xeffff638: 0x81b60080 <= (sc.pc & 0xffff) << 16 | > >>> sc.formatvec >> 16 > >>> 0xeffff634: 0x0000c00e <= sc.sr << 16 sc.pc >> 16 > >>> 0xeffff630: 0xd001e4e3 <= sc.a1 > >>> 0xeffff62c: 0xc0028780 <= sc.a0 > >>> 0xeffff628: 0xffffffff <= sc.d1 > >>> 0xeffff624: 0x0000041f <= sc.d0 > >>> 0xeffff620: 0xeffff738 <= sc.usp > >>> 0xeffff61c: 0x00000000 <= sc.mask > >>> 0xeffff618: 0x00000000 <= extramask > >>> 0xeffff614: 0x00000000 <= frame.retcode[1] > >>> 0xeffff610: 0x70774e40 moveq #119,%d0 ; trap #0 > >>> 0xeffff60c: 0xeffff61c <= frame->sc > >>> 0xeffff608: 0x00000080 <= tregs->vector > >>> 0xeffff604: 0x00000011 <= signal no. > >>> 0xeffff600: 0xeffff610 return address > >>> > >>> The above comes from dash running under gdb under qemu, which does > >>> not exhibit the failure but is convenient for that kind of > >>> experiment. > >> > >> I would have expected to see a different signal trampoline (for > >> sys_rt_sigreturn) ... > > > > Well, this seems to be the trampoline from setup_frame() and not > > setup_rt_frame(). > > According to the manpages I've seen, glibc ought to pick rt signals if > the kernel supports those (which I suppose it does). >
It's got to be the trampoline from setup_frame() because dash did this:

    act.sa_flags = 0;
    sigfillset(&act.sa_mask);
    sigaction(signo, &act, 0);

and the kernel did this:

    /* set up the stack frame */
    if (ksig->ka.sa.sa_flags & SA_SIGINFO)
        err = setup_rt_frame(ksig, oldset, regs);
    else
        err = setup_frame(ksig, oldset, regs);

> >
> >> But anyway:
> >>
> >> The saved pc is 0xc00e81b6 which does match the backtrace above.
> >> Vector offset 80 matches trap 0 which suggests 0xc00e81b6 should be
> >> the instruction after a trap 0 instruction. d0 is 1055 which is not a
> >> signal number I recognize.
> >>
> >
> > I don't know what d0 represents here. But &frame->sig == 0x11 is
> > correct (SIGCHLD).
>
> Correct - that all works out. But d0 holds the syscall number when we
> enter the kernel via trap 0, and that one is odd.
>

Well, you showed subsequently that the kernel was probably entered via a
page fault and not the get_thread_area trap. Would that explain the d0
value?

> >>> ...
> >>>
> >>> Here's some stack memory from the core dump.
> >>> > >>> 0xeffff0dc: 0xd000c38e return address waitproc+124 > >>> 0xeffff0d8: 0xd001c1ec frame 0 $fp == &suppressint > >>> 0xeffff0d4: 0x00add14b canary > >>> 0xeffff0d0: 0x00000000 > >>> 0xeffff0cc: 0x0000000a > >>> 0xeffff0c8: 0x00000202 > >>> 0xeffff0c4: 0x00000008 > >>> 0xeffff0c0: 0x00000000 > >>> 0xeffff0bc: 0x00000000 > >>> 0xeffff0b8: 0x00000174 > >>> 0xeffff0b4: 0x00000004 > >>> 0xeffff0b0: 0x00000004 > >>> 0xeffff0ac: 0x00000006 > >>> 0xeffff0a8: 0x000000e0 > >>> 0xeffff0a4: 0x000000e0 > >>> 0xeffff0a0: 0x00171f20 > >>> 0xeffff09c: 0x00171f20 > >>> 0xeffff098: 0x00171f20 > >>> 0xeffff094: 0x00000002 > >>> 0xeffff090: 0x00002000 > >>> 0xeffff08c: 0x00000006 > >>> 0xeffff088: 0x0000e920 > >>> 0xeffff084: 0x00005360 > >>> 0xeffff080: 0x00170700 > >>> 0xeffff07c: 0x00170700 > >>> 0xeffff078: 0x00170700 frame 0 $fp - 96 > >>> 0xeffff074: 0xd001b874 saved $a5 == dash .got > >>> 0xeffff070: 0xd001e498 saved $a3 == > >>> &dash_errno > >>> 0xeffff06c: 0xd001e718 frame 0 $sp saved $a2 == > >>> &gotsigchld > >>> 0xeffff068: 0x00000000 > >>> 0xeffff064: 0x00000000 > >>> 0xeffff060: 0xeffff11e > >>> 0xeffff05c: 0xffffffff > >>> 0xeffff058: 0xc00e4164 return address __wait3+244 > >>> 0xeffff054: 0x00add14b canary > >>> 0xeffff050: 0x00000001 > >>> 0xeffff04c: 0x00000004 > >>> 0xeffff048: 0x0000000d > >>> 0xeffff044: 0x0000000d > >>> 0xeffff040: 0x0015ef82 > >>> 0xeffff03c: 0x0015ef82 > >>> 0xeffff038: 0x0015ef82 > >>> 0xeffff034: 0x00000003 > >>> 0xeffff030: 0x00000004 > >>> 0xeffff02c: 0x00000004 > >>> 0xeffff028: 0x00000140 > >>> 0xeffff024: 0x00000140 > >>> 0xeffff020: 0x00000034 > >>> 0xeffff01c: 0x00000034 > >>> 0xeffff018: 0x00000034 > >>> 0xeffff014: 0x00000006 > >>> 0xeffff010: 0x003b003a > >>> 0xeffff00c: 0x000a0028 > >>> 0xeffff008: 0x00340020 > >>> 0xeffff004: 0xc019c000 saved $a5 == libc .got > >>> 0xeffff000: 0xeffff068 saved $a3 (corrupted) > >>> 0xefffeffc: 0x00000000 saved $a2 > >>> 0xefffeff8: 0x00000001 saved $d5 > >>> 0xefffeff4: 0xeffff122 saved 
$d4 > >>> 0xefffeff0: 0xeffff11e saved $d3 > >>> 0xefffefec: 0x00000000 saved $d2 > >>> 0xefffefe8: 0xc00e419a return address __GI___wait4_time64+38 > >>> 0xefffefe4: 0xc0028780 > >>> 0xefffefe0: 0x3c344bfb > >>> 0xefffefdc: 0x000af353 > >>> 0xefffefd8: 0x3c340170 > >>> 0xefffefd4: 0x00000000 > >>> 0xefffefd0: 0xc00e417c > >>> 0xefffefcc: 0xc00e417e > >>> 0xefffefc8: 0xc00e4180 > >>> 0xefffefc4: 0x48e73c34 > >>> 0xefffefc0: 0x00000000 > >>> 0xefffefbc: 0xefffeff8 > >>> 0xefffefb8: 0xefffeffc > >>> 0xefffefb4: 0x4bfb0170 > >>> 0xefffefb0: 0x0eee0709 > >>> 0xefffefac: 0x00000000 > >>> 0xefffefa8: 0x00000000 > >>> 0xefffefa4: 0x00000000 > >>> 0xefffefa0: 0x00000000 > >>> 0xefffef9c: 0x00000000 > >>> 0xefffef98: 0x00000000 > >>> 0xefffef94: 0x00000000 > >>> 0xefffef90: 0x00000000 > >>> 0xefffef8c: 0x00000000 > >>> 0xefffef88: 0x00000000 > >>> 0xefffef84: 0x00000000 > >>> 0xefffef80: 0x00000000 > >>> 0xefffef7c: 0x00000000 > >>> 0xefffef78: 0x00000000 > >>> 0xefffef74: 0x00000000 > >>> 0xefffef70: 0x00000000 > >>> 0xefffef6c: 0x00000000 > >>> 0xefffef68: 0x00000000 > >>> 0xefffef64: 0x00000000 > >>> 0xefffef60: 0x00000000 > >>> 0xefffef5c: 0x00000000 > >>> 0xefffef58: 0x00000000 > >>> 0xefffef54: 0x00000000 > >>> 0xefffef50: 0x00000000 > >>> 0xefffef4c: 0x00000000 > >>> 0xefffef48: 0x00000000 > >>> 0xefffef44: 0x00000000 > >>> 0xefffef40: 0x00000000 > >>> 0xefffef3c: 0x00000000 > >>> 0xefffef38: 0x00000000 > >>> 0xefffef34: 0x00000000 > >>> 0xefffef30: 0x00000000 > >>> 0xefffef2c: 0x00000000 > >>> 0xefffef28: 0x00000000 > >>> 0xefffef24: 0x00000000 > >>> 0xefffef20: 0x00000000 > >>> 0xefffef1c: 0x00000000 > >>> 0xefffef18: 0x00000000 > >>> 0xefffef14: 0x00000000 > >>> 0xefffef10: 0x7c0effff > >>> 0xefffef0c: 0xffffffff > >>> 0xefffef08: 0xaaaaaaaa > >>> 0xefffef04: 0xaf54eaaa > >>> 0xefffef00: 0x40040000 > >>> 0xefffeefc: 0x40040000 > >>> 0xefffeef8: 0x2b000000 > >>> 0xefffeef4: 0x00000000 > >>> 0xefffeef0: 0x00000000 > >>> 0xefffeeec: 0x408ece9a > >>> 0xefffeee8: 
0x00000000 > >>> 0xefffeee4: 0xf0ff0000 > >>> 0xefffeee0: 0x0f800000 > >>> 0xefffeedc: 0xf0fff0ff > >>> 0xefffeed8: 0x1f380000 > >>> 0xefffeed4: 0x00000000 > >>> 0xefffeed0: 0x00000000 > >>> 0xefffeecc: 0x00000000 > >>> 0xefffeec8: 0xffffffff > >>> 0xefffeec4: 0xffffffff > >>> 0xefffeec0: 0x7fff0000 > >>> 0xefffeebc: 0xffffffff > >>> 0xefffeeb8: 0xffffffff > >>> 0xefffeeb4: 0x7fff0000 sc_formatvec > >>> > >>> The signal frame is not readily apparent (to me). > >> > >> From looking at the above stack dump, sc ought to start at 0xefffee90, > >> and the trampoline would be three words below that. > > > > 0xefffeeb0: 0x4178b008 sc_pc, sc_formatvec > > 0xefffeeac: 0x0008c00e sc_sr, sc_pc > > 0xefffeea8: 0xd00223bb sc_a1 > > 0xefffeea4: 0xd001e32c sc_a0 > > 0xefffeea0: 0x00000003 sc_d1 > > 0xefffee9c: 0xeffff11e sc_d0 > > 0xefffee98: 0xeffff004 sc_usp > > 0xefffee94: 0x00000000 sc_mask > > 0xefffee90: 0x00000000 extramask > > 0xefffee8c: 0xc0024a90 retcode[1] > > 0xefffee88: 0x70774e40 retcode[0] > > 0xefffee84: 0xefffee94 psc > > 0xefffee80: 0x00000008 code > > 0xefffee7c: 0x00000011 sig > > 0xefffee78: 0xefffee88 pretcode > > OK, that's our SIGCHLD. But the signal frame format is odd ... > > Frame format b, vector offset 008. That's a bus error? > How does that get on the user mode stack? 
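
For reference, the "frame format b, vector offset 008" reading can be reproduced from the two dump words that hold sc_sr, sc_pc and sc_formatvec. This is only a minimal sketch: the helper names are made up for illustration and are not taken from the kernel source. It assumes the usual m68k convention that the top four bits of the format/vector word are the frame format and the low twelve bits are the vector offset.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not kernel code. */

/* The second dump word holds sc_pc's low half and sc_formatvec. */
static inline uint16_t formatvec_from_dump_word(uint32_t pc_fv_word)
{
	return pc_fv_word & 0xffff;
}

/* sc_pc straddles the two dump words: high half first, low half second. */
static inline uint32_t sc_pc_from_dump_words(uint32_t sr_pc_word,
					     uint32_t pc_fv_word)
{
	return ((sr_pc_word & 0xffff) << 16) | (pc_fv_word >> 16);
}

/* Frame format: top four bits of the format/vector word. */
static inline unsigned int frame_format(uint16_t formatvec)
{
	return formatvec >> 12;
}

/* Vector offset in bytes: low twelve bits. */
static inline unsigned int frame_vector_offset(uint16_t formatvec)
{
	return formatvec & 0xfff;
}

/* Vector number: each vector table entry is four bytes. */
static inline unsigned int frame_vector_number(uint16_t formatvec)
{
	return frame_vector_offset(formatvec) / 4;
}
```

Fed with 0x0008c00e and 0x4178b008 from the core dump, this gives PC 0xc00e4178, format 0xb and vector offset 0x008 (vector 2, bus error). The QEMU dump words 0x0000c00e and 0x81b60080 give PC 0xc00e81b6, format 0 and vector offset 0x080 (vector 32, trap #0).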
>
> > 0xefffee74: 0xc019c000
> > 0xefffee70: 0x00000000
> > 0xefffee6c: 0xc0025878
> > 0xefffee68: 0xc0007ed4
> > 0xefffee64: 0xc0024000
> > 0xefffee60: 0xefffef50
> > 0xefffee5c: 0xc0024000
> > 0xefffee58: 0xc002a034
> > 0xefffee54: 0xc0024a90
> > 0xefffee50: 0xc0025878
> > 0xefffee4c: 0x00000001
> > 0xefffee48: 0x0017f020
> > 0xefffee44: 0x0000002c
> > 0xefffee40: 0x0000000f
> > 0xefffee3c: 0x00000000
> > 0xefffee38: 0xfffff7fa
> > 0xefffee34: 0xffffffff
> > 0xefffee30: 0x00009782
> > 0xefffee2c: 0x00000000
> > 0xefffee28: 0x0000001e
> > 0xefffee24: 0xc0025858
> > 0xefffee20: 0xc0025af8
> > 0xefffee1c: 0xc000b376
> > 0xefffee18: 0xc0024000
> > 0xefffee14: 0xc0025878
> > 0xefffee10: 0x0000001d
> > 0xefffee0c: 0xd0001b60
> > 0xefffee08: 0x0000002f
> > 0xefffee04: 0xc002563e
> > 0xefffee00: 0xc0025490
> >
> >> The last address you show corresponds to 0xeffff640 in first dump
> >> above, which is at the start of the saved fpregs. I'd say we just
> >> miss the beginning of the signal frame?
> >>
> >
> > It looks like you're right. I'm not sure how I missed that.
> >
> > So when the signal was delivered, PC == 0xc00e4178 and USP ==
> > 0xc00e4178.
>
> USP is 0xeffff004 AFAICS. That's the location a5 was saved to above
> (holding libc .got according to your interpretation).
>

Right, it was a typo. USP is 0xeffff004, where a5 is to be saved.

> The saved PC is that from the exception frame, in this case a long bus
> error sequence fault frame. The PC is that of the instruction executing
> when the fault occurred. As you say, that's the moveml saving registers
> to the stack.
>
> I don't believe the whole fault frame is on the signal stack in one
> contiguous piece, just the first four words, then we have struct
> sigcontext. But after that, the extra contents follows, and that nicely
> explains the extra bits right below the return address from the
> __m68k_read_tp call.
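
To make the labels in the annotated core dump easier to cross-check, here is a rough sketch of the non-rt signal frame as it appears there. The struct and macro names are hypothetical and the field list is transcribed from the dump labels, not copied from the kernel headers. Pointers are written as uint32_t and the structs are packed (a GCC/Clang attribute) so that the offsets come out the same on any build host. Only the integer part of the context is modelled; the floating point fields that follow are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Integer part of the signal context, named as in the dump labels. */
struct sketch_sigcontext {
	uint32_t sc_mask;
	uint32_t sc_usp;
	uint32_t sc_d0;
	uint32_t sc_d1;
	uint32_t sc_a0;
	uint32_t sc_a1;
	uint16_t sc_sr;
	uint32_t sc_pc;		/* only 2-byte aligned, hence the packing */
	uint16_t sc_formatvec;
} __attribute__((packed));

/* Frame members in ascending address order, as labelled in the dump. */
struct sketch_sigframe {
	uint32_t pretcode;	/* points at retcode[] */
	uint32_t sig;
	uint32_t code;
	uint32_t psc;		/* points at sc */
	uint32_t retcode[2];
	uint32_t extramask;
	struct sketch_sigcontext sc;
} __attribute__((packed));

/* Address of a frame member, given the address of the frame itself. */
#define FRAME_ADDR(base, member) \
	((uint32_t)((base) + offsetof(struct sketch_sigframe, member)))
```

With the frame at 0xefffee78 (the address of pretcode in the dump), FRAME_ADDR() puts retcode at 0xefffee88 and sc at 0xefffee94, which are the values stored in pretcode and psc, and it puts sc_pc at 0xefffeeae and sc_formatvec at 0xefffeeb2.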
>
> > Those addresses can be found in the disassembly and the stack contents
> > I sent previously (quoted above) and it all seems to line up.
> >
> >> (My reasoning is that copy_siginfo_to_user clears the end of the
> >> signal stack, which is what we can see in both cases.)
> >>
> >> Can't explain the 14 words below the saved return address though.
> >>
> >
> > Right. Is it sc_fpstate? Perhaps we should expect QEMU to differ here.
>
> See above - I think what's stored there is the extra frame content for a
> format b bus error frame. But that extra frame is incomplete at best
> (should be 22 longwords, only 14 are seen). Probably overwritten by the
> stack frame from __GI___wait4_time64.
>

Maybe the exception frame leaked onto the user stack via setup_frame()?

> Let's parse what's left:
> <=
> >>> 0xefffefe4: 0xc0028780 <= internal registers (6x)
> >>> 0xefffefe0: 0x3c344bfb <=
> >>> 0xefffefdc: 0x000af353 <=
> >>> 0xefffefd8: 0x3c340170 <= internal reg; version no.
> >>> 0xefffefd4: 0x00000000 <= data input buffer
> >>> 0xefffefd0: 0xc00e417c <= internal registers (2x)
> >>> 0xefffefcc: 0xc00e417e <= stage b address
> >>> 0xefffefc8: 0xc00e4180 <= internal registers (4x)
> >>> 0xefffefc4: 0x48e73c34 <=
> >>> 0xefffefc0: 0x00000000 <= data output buffer
> >>> 0xefffefbc: 0xefffeff8 <= internal registers (2x)
> >>> 0xefffefb8: 0xefffeffc <= data fault address
> >>> 0xefffefb4: 0x4bfb0170 <= ins stage c, stage b
> >>> 0xefffefb0: 0x0eee0709 <= internal register; ssw
>
> The fault address is the location on the stack where a2 is saved. That
> does match the data output buffer contents BTW. fc, fb, rc, rb bits
> clear means the fault didn't occur in stage b or c instructions. ssw bit
> 8 set indicates a data fault - the data cycle should be rerun on rte. rm
> and rw bits clear tell us it's a write fault. If the moveml instruction
> copies registers to the stack in descending order, the fault address
> makes sense - the stack pointer just crossed a page boundary.
>

Well spotted!

> >
> > Bottom line is, the corrupted %a3 register would have been saved by
> > the MOVEM instruction at 0xc00e4178, which turns out to be the PC in
> > the signal frame. So it certainly looks like the kernel was the
> > culprit here.
>
> I think the moveml instruction did cause a bus error, and on return from
> that exception the signal got delivered.
>

Maybe the signal frame was partially overwritten by the resumed MOVEM?

I wonder what we'd see if we patched the kernel to log every user data
write fault caused by a MOVEM instruction. I'll try to code that up.

> On entering the buserror handler, only a1 and a2 are saved, but the
> comment in entry.h states that a3-a6 and d6, d7 are preserved by C code.
> After buserr_c returns, a3 should be restored to what it was when taking
> the bus error. All registers restored before rte, the moveml instruction
> ought to be able to resume normally.
>
> Unless that register use constraint has changed, I don't see how a3
> could have changed midway during return from the bus error exception.
> But maybe a disassembly of buserr_c from your kernel could confirm that?
>

I disassembled the relevant build. AFAICT, buserr_c() saves and restores
those registers in the right places.

BTW, I've reproduced the failures with kernels built with both GCC 12 and
GCC 6.
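
As a starting point for that kind of logging, here is a user-space sketch of the two checks it would need: whether the SSW describes a plain user data write fault, and whether the opcode at the faulting PC is a MOVEM that stores registers to memory. This is not the kernel patch. The function names are hypothetical, and it assumes the opcode and extension word have already been read from the saved PC. The SSW bit positions follow the reading given above (fc/fb/rc/rb at the top, data fault at bit 8, rm and rw below that); function code 1 is taken to mean user data space.

```c
#include <stdbool.h>
#include <stdint.h>

/* SSW bits of the long bus error frame, as discussed above. */
#define SSW_FC		0x8000	/* fault on stage C of the instruction pipe */
#define SSW_FB		0x4000	/* fault on stage B */
#define SSW_RC		0x2000	/* rerun stage C */
#define SSW_RB		0x1000	/* rerun stage B */
#define SSW_DF		0x0100	/* data fault, rerun the data cycle on rte */
#define SSW_RM		0x0080	/* read-modify-write cycle */
#define SSW_RW		0x0040	/* set for a read, clear for a write */
#define SSW_FCODE	0x0007	/* function code of the faulting access */

#define FCODE_USER_DATA	1

/* Plain (not read-modify-write) data write fault in user data space. */
static inline bool ssw_is_user_data_write(uint16_t ssw)
{
	return (ssw & SSW_DF) && !(ssw & (SSW_RM | SSW_RW)) &&
	       (ssw & SSW_FCODE) == FCODE_USER_DATA;
}

/* MOVEM with registers as the source, word or long sized. */
static inline bool is_movem_to_mem(uint16_t opcode)
{
	unsigned int mode = (opcode >> 3) & 7;
	unsigned int reg = opcode & 7;

	if ((opcode & 0xff80) != 0x4880)
		return false;
	/* Mode 0 with this bit pattern is EXT; modes 1 and 3 are not valid. */
	return mode == 2 || mode == 4 || mode == 5 || mode == 6 ||
	       (mode == 7 && reg <= 1);
}

/*
 * Registers in the order a predecrement MOVEM stores them (a7 first,
 * d0 last, descending addresses). In this mode mask bit 0 is a7 and
 * bit 15 is d0. Numbers 0-7 mean d0-d7, 8-15 mean a0-a7.
 * Returns the number of registers stored.
 */
static inline int movem_predec_order(uint16_t mask, int regs[16])
{
	int n = 0;

	for (int bit = 0; bit < 16; bit++)
		if (mask & (1u << bit))
			regs[n++] = 15 - bit;
	return n;
}

/* The condition the proposed logging would test for. */
static inline bool is_user_movem_write_fault(uint16_t ssw, uint16_t opcode)
{
	return ssw_is_user_data_write(ssw) && is_movem_to_mem(opcode);
}
```

For the values in the core dump, ssw 0x0709 and opcode 0x48e7 satisfy the check, and mask 0x3c34 decodes to a5, a3, a2, d5, d4, d3, d2 in store order. That is the same order as the saved registers in the stack dump, with a2 being the first store below 0xeffff000.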