On 12/21/22 07:08, David Hill wrote:
On 12/21/22 05:33, Martin Pieuchot wrote:
On 18/12/22(Sun) 20:55, Martin Pieuchot wrote:
On 17/12/22(Sat) 14:15, David Hill wrote:
On 10/28/22 03:46, Renato Aguiar wrote:
Use of bbolt Go library causes 7.2 to freeze. I suspect it is
triggering some
sort of deadlock in mmap because threads get stuck at vmmaplk.
I managed to reproduce it consistently in a laptop with 4 cores
(i5-1135G7)
using one unit test from bbolt:
$ doas pkg_add git go
$ git clone https://github.com/etcd-io/bbolt.git
$ cd bbolt
$ git checkout v1.3.6
$ go test -v -run TestSimulate_10000op_10p
The test never ends and this is the 'top' report:
PID TID PRI NICE SIZE RES STATE WAIT TIME
CPU COMMAND
32181 438138 -18 0 57M 13M idle uvn_fls 0:00 0.00%
bbolt.test
32181 331169 10 0 57M 13M sleep/1 nanoslp 0:00 0.00%
bbolt.test
32181 497390 10 0 57M 13M idle vmmaplk 0:00 0.00%
bbolt.test
32181 380477 14 0 57M 13M idle vmmaplk 0:00 0.00%
bbolt.test
32181 336950 14 0 57M 13M idle vmmaplk 0:00 0.00%
bbolt.test
32181 491043 14 0 57M 13M idle vmmaplk 0:00 0.00%
bbolt.test
32181 347071 2 0 57M 13M idle kqread 0:00 0.00%
bbolt.test
After this, most commands just hang. For example, running a 'ps |
grep foo' in
another shell would do it.
I can reproduce this on MP, but not SP. Here is /trace from ddb
after using
the ddb.trigger sysctl. Is there any other information I could pull
from
DDB that may help?
Thanks for the useful report David!
The issue seems to be a deadlock between the `vmmaplk' and a particular
`vmobjlock'. uvm_map_clean() calls uvn_flush() which sleeps with the
`vmmaplk' held.
I'll think a bit about this and try to come up with a fix ASAP.
I'm missing a piece of information. All the threads in your report seem
to want a read version of the `vmmaplk' so they should not block. Could
you reproduce the hang with a WITNESS kernel and print 'show all locks'
in addition to all the informations you've reported?
Sure. Its always the same; 2 processes (sysctl and bbolt.test) and 3
locks (sysctllk, kernel_lock, and vmmaplk) with bbolt.test always on the
uvn_flsh thread.
Process 98301 (sysctl) thread 0xfff......
exclusive rwlock sysctllk r = 0 (0xfffff...)
exclusive kernel_lock &kernel_lock r = 0 (0xffffff......)
Process 32181 (bbolt.test) thread (0xffffff...) (438138)
shared rwlock vmmaplk r = 0 (0xfffff......)
To reproduce, just do:
$ doas pkg_add git go
$ git clone https://github.com/etcd-io/bbolt.git
$ cd bbolt
$ git checkout v1.3.6
$ go test -v -run TestSimulate_10000op_10p
The test will hang happen almost instantly.
Not sure if this is a hint..
https://github.com/etcd-io/bbolt/blob/master/db.go#L27-L31
// IgnoreNoSync specifies whether the NoSync field of a DB is ignored when
// syncing changes to a file. This is required as some operating systems,
// such as OpenBSD, do not have a unified buffer cache (UBC) and writes
// must be synchronized using the msync(2) syscall.
const IgnoreNoSync = runtime.GOOS == "openbsd"