On Thu, Dec 12, 2024 at 11:30 AM Martin Pieuchot <m...@grenadille.net> wrote:
...
> That sounds like a memory corruption of some sort. It might be that
> recent changes hide it. I'd be glad if you could test George's change.
...
> Thanks. If you run into such crash again, please try to get a trace
> from the cpu that panic'd. In this case cpu0.

I'm finally back home and power cycled to get the machine back. After
compiling a new kernel with source from cvs (including George's change)
I pretty quickly got another crash:

OpenBSD/powerpc64 (t.n2vi.net) (console)

login: pdaanri c0d:axr 8k e0drxsni4es7rl d0dsixai4gs0nr0o0 0s0t0xi004c0 s0ts0re0ar0tp0i o n t r a "pa n o n d -a t> yap ner _ 3l 0o0 c0xk s8 r drs 1i 9 s 0 0 r0 0 0 =0 x=4 0 0 0 t0 y000p 0e 9 0 033 02 0 00 sr 0 0rat 1 9 0 0 0 0 t 0 r 0 a 0 0 0 p 0 t 0 9 0 3 2y a p t e 3 0 0a d d 3 9 8 l r b 8 d fbN08Ue8cLc 0Ls r r1|
Stopped at      _rb_remove+0x36c:       ld r4,8(r3)
    TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
*453813   7858   8889   0x2000002          0    3  compile
  36553  81731   8889   0x2000002          0    2  compile
  99640  86457   8889   0x2000002          0    7  compile
 242134  64462   8889   0x2000002          0    0  compile
  68242  72226   8889   0x2000002  0x4000000    4  go
 495275   4590   8889   0x2000002  0x4000000    1  compile
 250285  66963      0     0x14000      0x200    6  reaper
 241654  98481      0     0x14000      0x200    5  pagedaemon
_rb_remove+0x36c
uvm_pmr_get1page+0x130
uvm_pmr_getpages+0x474
uvm_pglistalloc+0x11c
km_alloc+0x364
pool_page_alloc+0x64
pool_p_alloc+0x94
pool_do_get+0x298
pool_get+0xcc
--db_more-- q
amap_alloc1+0x120
amap_alloc+0x4c
amap_copy+0x3fc
uvm_fault_check+0x2cc
uvm_fault+0x118
https://www.openbsd.org/ddb.html describes the minimum info required in bug
reports. Insufficient info makes it difficult to find and fix bugs.
ddb{3}> show panic
*cpu1: kernel diagnostic assertion "anon->an_lock == NULL || rw_write_held(anon->an_lock)" failed: file "/sys/uvm/uvm_anon.c", line 85
ddb{3}> mach ddbcpu 1
Stopped at      cpu_intr+0x50:  ori r0,r0,0x0
cpu_intr+0x50
xive_hvi+0x1b8
hvi_intr+0x38
trap+0xd4
trapagain+0x4
--- trap (type 0xea0) ---
opal_call+0x50
opal_cnputc+0x8c
cnputc+0x64
db_putchar+0x3b0
kputchar+0x1fc
kprintf+0xd18
db_printf+0x78
panic+0xb8
__assert+0x30
ddb{1}> show registers
r0      0x75c444                xive_hvi+0x1bc
r1      0xc0000001601da158
r2      0x1054000               .TOC.
r3      0x1
r4      0
r5      0x80000000
r6      0
r7      0x31c60060
r8      0
r9      0x31c60060
r10     0x31c60060
r11     0x75c444                xive_hvi+0x1bc
r12     0xae8524                cpu_intr
r13     0x4af56a908
r14     0
r15     0x3b
r16     0x30
r17     0
r18     0
r19     0xc0000001601da8d0
r20     0
r21     0xffffffffffffff81
r22     0x1
r23     0xc00000003e4c8700
r24     0xc00000013acd9000
r25     0xc00000003e3e1080
r26     0xc00000003e3e1060
r27     0
r28     0x10b2c70               cpu_info+0xf08
r29     0xc00000003e3e1000
r30     0x1
r31     0x9000000000009032
lr      0xae8574                cpu_intr+0x50
cr      0x40009032
xer     0x20040000
ctr     0xae8524                cpu_intr
iar     0xae8574                cpu_intr+0x50
msr     0x9000000000029032
dar     0xc001e65f80
dsisr   0x42000000
cpu_intr+0x50:  ori r0,r0,0x0
ddb{1}> show proc
PROC (compile) tid=495275 pid=4590 tcnt=8 stat=onproc
    flags process=2000002 proc=4000000
    runpri=82, usrpri=82, slppri=32, nice=20
    wchan=0x0, wmesg=, ps_single=0x0 scnt=0 ecnt=0
    forw=0xffffffffffffffff, list=0xc00000003db20020,0xc00000003db55c68
    process=0xc00000014c479048 user=0xc0000001601d6000, vmspace=0xc000000144b82178
    estcpu=32, cpticks=1, pctcpu=0.0, user=1, sys=0, intr=0
ddb{1}> show all locks
No such command
ddb{1}> show witness
No such command
ddb{1}> show locks
No such command
ddb{1}> show ?
Bad character
all bcstats breaks buf extents malloc map mbuf mount nfsreq nfsnode object page panic pool proc registers route socket struct swap tdb uvmexp vnode watches
ddb{1}> show all ?
Bad character
procs callout clockintr pools mounts vnodes bufs routes nfsreqs nfsnodes tdbs
ddb{1}> mach ddbcpu 3
Stopped at      _rb_remove+0x36c:       ld r4,8(r3)
_rb_remove+0x36c
uvm_pmr_get1page+0x130
uvm_pmr_getpages+0x474
uvm_pglistalloc+0x11c
km_alloc+0x364
pool_page_alloc+0x64
pool_p_alloc+0x94
pool_do_get+0x298
pool_get+0xcc
amap_alloc1+0x120
amap_alloc+0x4c
amap_copy+0x3fc
uvm_fault_check+0x2cc
uvm_fault+0x118
ddb{3}> show all locks
No such command
ddb{3}> show bcstats
Current Buffer Cache status:
numbufs 129161 busymapped 0, delwri 1845
kvaslots 52428 avail kva slots 52428
bufpages 931802, dmapages 931802, dirtypages 14752
pendingreads 41, pendingwrites 21
highflips 0, highflops 0, dmaflips 0
ddb{3}> show page
PAGE 0xf70882:
  flags=72696340, vers=1949199922, wire_count=1986604654, pa=0x7379732f61726368
  uobject=0x3620555443203230, uanon=0x2030323a31383a35, offset=0x32340a2020202065
  [page ownership tracking disabled]
  vm_page_md 0xf708ea
ddb{3}> show uvmexp
Current UVM status:
  pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
  7890068 VM pages: 220193 active, 623328 inactive, 1 wired, 6023076 free (726057 zero)
  freemin=263002, free-target=350669, inactive-target=350670, wired-max=2630022
  faults=646752174, traps=823614522, intrs=225630053, ctxswitch=257158930 fpuswitch=0
  softint=84868373, syscalls=493843057, kmapent=31
  fault counts:
    noram=0, noanon=0, noamap=0, pgwait=0, pgrele=0
    ok relocks(total)=3874988(3937104), anget(retries)=202347794(0), amapcopy=44517770
    neighbor anon/obj pg=14281629/248150553, gets(lock/unlock)=87584800/3937305
    cases: anon=191542501, anoncow=10805293, obj=75240848, prcopy=12281635, przero=356881834
  daemon and swap counts:
    woke=40, revs=0, scans=0, obscans=0, anscans=0
    busy=0, freed=0, reactivate=0, deactivate=0
    pageouts=0, pending=0, nswget=0
    nswapdev=1
    swpages=8454143, swpginuse=0, swpgonly=0 paging=0
  kernel pointers:
    objs(kern)=0x106cd70
ddb{3}> show registers
r0      0xb8df08                uvm_pmr_get1page+0x134
r1      0xc00000013e7841d0
r2      0x1054000               .TOC.
r3      0
r4      0xc000000014684e98
r5      0xc000000014684ea0
r6      0xc0000000148c5608
r7      0
r8      0
r9      0x9000000000001032
r10     0x1032900000000000
r11     0xb8df08                uvm_pmr_get1page+0x134
r12     0
r13     0x4af56bb08
r14     0
r15     0xffffffffffffffff
r16     0
r17     0xffffffffffffffff
r18     0
r19     0xc00000013e7845c0
r20     0
r21     0xc000000014683c00
r22     0xfd4ce0                uvm_pmr_addr_RBT_INFO
r23     0xc000000014683c10
r24     0x1
r25     0
r26     0xc000000014683a10
r27     0xc000000014684490
r28     0xc000000014683c10
r29     0xc0000000000a1000
r30     0xfd4ce0                uvm_pmr_addr_RBT_INFO
r31     0
lr      0xb8df08                uvm_pmr_get1page+0x134
cr      0x22222032
xer     0x20040000
ctr     0xb9bb0c                generic_space_write_1
iar     0xadd398                _rb_remove+0x36c
msr     0x9000000000009032
dar     0x8
dsisr   0x40000000
_rb_remove+0x36c:       ld r4,8(r3)

(I tried a few commands that I'd seen in past emails that seemed
relevant but I gather are now obsolete. I'll leave the machine at that
point in ddb for a while in case there is something else I should print.)