George, thank you for the suggestion of changing membar_enter and
membar_consumer from isync to sync. I did that and the frequency of
crashes went way down, admittedly on a workload that is not solidly
reproducible. But last night there was finally another crash (see
below), so that's not the full solution. I'll keep reading the code;
nothing is obviously wrong so far to my naive eye.

Martin, to respond to your question about pool corruption: yes, there
seems to be some corruption or exhaustion of the pmap or pted pools,
but I don't see evidence yet that it happens in the same place or in
the same way each time.



panic: kernel diagnostic assertion "UVM_PSEG_INUSE(pseg, id)" failed: file "/sys/uvm/uvm_pager.c", line 227
Stopped at      panic+0x134:    ori r0,r0,0x0

    TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
 210089  19233   8889  0x18000001          0    0  go.test
 395921  69071   8889  0x1a000003          0    6  compile
 118941  35800   8889  0x1a000003          0    3  compile
 281308  32582   8889  0x1a000003          0    1  compile
*370626   8557   8889  0x1a000003  0x4000000    2  link
     72  61744   8889  0x1a000003          0    4  compile
 243588  29484   8889  0x1a000003          0    7  go.test
 103299  39952   8889  0x1a000003          0    5  go
panic+0x134
__assert+0x30
uvm_pseg_release+0x380
uvn_io+0x2d4
uvn_get+0x1dc
uvm_fault_lower+0x22c
uvm_fault+0x200
trap+0x4a8
trapagain+0x4
--- trap (type 0x300) ---

           End of kernel: 0xc0000273c8 lr 0x118168

           https://www.openbsd.org/ddb.html describes the minimum info required in bug
           reports.  Insufficient info makes it difficult to find and fix bugs.

ddb{2}> mach ddbcpu 0
Stopped at      cpu_intr+0x50:  ori r0,r0,0x0
cpu_intr+0x50
xive_hvi+0x1b8
hvi_intr+0x38
trap+0xd4
trapagain+0x4
--- trap (type 0xea0) ---
_kernel_lock+0xe0
xive_hvi+0x1a0
hvi_intr+0x38
trap+0xd4
trapagain+0x4
--- trap (type 0xea0) ---
uvm_pmr_addr_RBT_COMPARE+0x28
uvm_pmr_pnaddr+0x70
uvm_pmr_insert_addr+0x78
uvm_pmr_remove_1strange+0x39c
ddb{0}> mach ddbcpu 1
Stopped at      cpu_intr+0x50:  ori r0,r0,0x0
cpu_intr+0x50
xive_hvi+0x1b8
hvi_intr+0x38
trap+0xd4
trapagain+0x4
--- trap (type 0xea0) ---
mtx_enter+0x5c
uvm_pmr_getpages+0x2a8
uvm_pglistalloc+0x11c
km_alloc+0x364
pool_page_alloc+0x64
pool_p_alloc+0x94
pool_do_get+0x298
pool_get+0xcc
pmap_enter+0x1ac

ddb{1}> mach ddbcpu 3
Stopped at      cpu_intr+0x50:  ori r0,r0,0x0
cpu_intr+0x50
xive_hvi+0x1b8
hvi_intr+0x38
trap+0xd4
trapagain+0x4
--- trap (type 0xea0) ---
mtx_enter+0x5c
uvm_pmr_freepageq+0xf0
uvm_pglistfree+0x28
km_alloc+0x3b8
pool_page_alloc+0x64
pool_p_alloc+0x94
pool_do_get+0x298
pool_get+0xcc
pmap_enter+0x1ac

ddb{3}> mach ddbcpu 4
Stopped at      cpu_intr+0x50:  ori r0,r0,0x0
cpu_intr+0x50
xive_hvi+0x1b8
hvi_intr+0x38
trap+0xd4
trapagain+0x4
--- trap (type 0xea0) ---
mtx_enter+0x5c
uvm_wait+0xbc
uvm_fault_lower+0x94c
uvm_fault+0x200
trap+0x270
trapagain+0x4
--- trap (type 0x400) ---
End of kernel: 0xc000037df8 lr 0x8c3420

ddb{4}> mach ddbcpu 5
Stopped at      cpu_intr+0x50:  ori r0,r0,0x0
cpu_intr+0x50
xive_hvi+0x1b8
hvi_intr+0x38
trap+0xd4
trapagain+0x4
--- trap (type 0xea0) ---
uvm_pmr_addr_RBT_COMPARE+0x28
uvm_pmr_pnaddr+0x70
uvm_pmr_insert_addr+0x78
uvm_pmr_remove_1strange+0x39c
uvm_pmr_freepageq+0x150
uvm_pglistfree+0x28
km_alloc+0x3b8
pool_page_alloc+0x64
pool_p_alloc+0x94

ddb{5}> mach ddbcpu 6
Stopped at      cpu_intr+0x50:  ori r0,r0,0x0
cpu_intr+0x50
xive_hvi+0x1b8
hvi_intr+0x38
trap+0xd4
trapagain+0x4
--- trap (type 0xea0) ---
mtx_enter+0x54
uvm_wait+0xbc
uvm_fault_lower+0x94c
uvm_fault+0x200
trap+0x270
trapagain+0x4
--- trap (type 0x400) ---
End of kernel: 0xc000037df8 lr 0x363d90
ddb{6}> mach ddbcpu 7
Stopped at      cpu_intr+0x50:  ori r0,r0,0x0
cpu_intr+0x50
xive_hvi+0x1b8
hvi_intr+0x38
trap+0xd4
trapagain+0x4
--- trap (type 0xea0) ---
mtx_enter+0x5c
uvm_pmr_getpages+0x2a8
uvm_pglistalloc+0x11c
km_alloc+0x364
pool_page_alloc+0x64
pool_p_alloc+0x94
pool_do_get+0x298
pool_get+0xcc
pmap_enter+0x1ac
ddb{7}>
