On Mon, Apr 21, 2025 at 07:52:07AM +1000, matthew green wrote:
> > (gdb) print panicstr
> > $2 = 0xffffffff80f06020 <scratchstr> "cpu0: softints stuck for 16 seconds"
> 
> you'll need to find cpu0's back trace.  try feeding the core
> into crash(8) instead of GDB, it should be easier to find.

crash> bt
end() at 0
kern_reboot() at kern_reboot+0x93
vpanic() at vpanic+0x16b
panic() at vprintf
heartbeat() at heartbeat+0x1f2
hardclock() at hardclock+0x9c
Xresume_lapic_ltimer() at Xresume_lapic_ltimer+0x1e
--- interrupt ---
mutex_spin_exit() at mutex_spin_exit+0x5a
callout_softclock() at callout_softclock+0xad
 /usr/src/sys/arch/amd64/compile/QUARK/../../../../kern/kern_timeout.c:873 v1.79
softint_dispatch() at softint_dispatch+0x8f
 /usr/src/sys/arch/amd64/compile/QUARK/../../../../kern/kern_softint.c:605 v1.76
 (this points at an SDT_PROBE?! So had the handler already been called and returned?)
  
crash> machine cpu 0
No such command: cpu
crash> ps
PID     LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
2917 > 2917 7   0   8060000   ffff8052e4a14000                tar
0    >    5 7   0       200   ffff8055abee1c00          softclk/0
(rest idle)
crash> show proc /a ffff8052e4a14000
lwp_t ffff8052e4a14000
tar: pid 2917 proc ffff80526621c2c0 vmspace/map ffff8052f09750c0 flags 4000
> lwp 2917 ffff8052e4a14000 pcb ffffc404c234e000
    stat 7 flags 8060000 cpu 0 pri 25 ref 0
crash> show proc /a ffff8055abee1c00
lwp_t ffff8055abee1c00
system: pid 0 proc ffffffff80e47d00 vmspace/map ffffffff80efea20 flags 20002
> lwp 5 [softclk/0] ffff8055abee1c00 pcb ffffc404a4604000
    stat 7 flags 200 cpu 0 pri 220 ref 0
  lwp 4 [softbio/0] ffff8055abee1800 pcb ffffc404a45fe000
    stat 1 flags 200 cpu 0 pri 221 ref 0
  lwp 3 [softnet/0] ffff8055abee1400 pcb ffffc404a45f8000
    stat 1 flags 200 cpu 0 pri 222 ref 0
  lwp 2 [idle/0] ffff8055abee1000 pcb ffffc404a41a7000
    stat 1 flags 201 cpu 0 pri 0 ref 0
  lwp 0 [swapper] ffffffff80e47580 pcb ffffffff81230000
    stat 2 flags 240 cpu 0 pri 125 ref 0
crash> x/i ffffc404c234e000,5
ffffc404c234e000:       addb    %al,0 (%rax)
ffffc404c234e002:       addb    %al,0 (%rax)
ffffc404c234e004:       xorl    0 (%rax),%eax
ffffc404c234e006:       addl    $0x352ff080,%eax
ffffc404c234e00b:       ret     $0xc404

I don't see how to find out what tar was up to. Firefox was running
at the same time, but tar and softclk/0 were the only LWPs running when
the heartbeat check fired (its timeout is set to the default, 15s).


> FWIW, i upgraded my main desktop to sources from about 70
> hours ago, shortly after 70 hours ago, no problems here yet,
> and i've done a full local pkg rebuild and quite a lot of
> other random (mostly network) IO.  *touches wood*.

My pbulk box is happy. The machine that panicked is the laptop with the
"fast" NVMe "disk", as opposed to the pbulk box's "slow" Samsung SSD 850
EVO M.2.


Cheers,

Patrick
