On Tue, Apr 23, 2024 at 02:48:32PM +0200, Martin Pieuchot wrote:
> On 22/04/24(Mon) 16:18, Mark Kettenis wrote:
> > > Date: Mon, 22 Apr 2024 15:39:55 +0200
> > > From: Alexander Bluhm <bl...@openbsd.org>
> > >
> > > Hi,
> > >
> > > I see a witness lock order reversal warning with soreceive. It
> > > happens during NFS regress tests. In /var/log/messages is more
> > > context from regress.
> > >
> > > Apr 22 03:18:08 ot29 /bsd: uid 0 on /mnt/regress-ffs/fstest_49fd035b8230791792326afb0604868b: out of inodes
> > > Apr 22 03:18:21 ot29 mountd[6781]: Bad exports list line /mnt/regress-nfs-server
> > > Apr 22 03:19:08 ot29 /bsd: witness: lock order reversal:
> > > Apr 22 03:19:08 ot29 /bsd: 1st 0xfffffd85c8ae12a8 vmmaplk (&map->lock)
> > > Apr 22 03:19:08 ot29 /bsd: 2nd 0xffff80004c488c78 nfsnode (&np->n_lock)
> > > Apr 22 03:19:08 ot29 /bsd: lock order data w2 -> w1 missing
> > > Apr 22 03:19:08 ot29 /bsd: lock order "&map->lock"(rwlock) -> "&np->n_lock"(rrwlock) first seen at:
> > > Apr 22 03:19:08 ot29 /bsd: #0 rw_enter+0x6d
> > > Apr 22 03:19:08 ot29 /bsd: #1 rrw_enter+0x5e
> > > Apr 22 03:19:08 ot29 /bsd: #2 VOP_LOCK+0x5f
> > > Apr 22 03:19:08 ot29 /bsd: #3 vn_lock+0xbc
> > > Apr 22 03:19:08 ot29 /bsd: #4 vn_rdwr+0x83
> > > Apr 22 03:19:08 ot29 /bsd: #5 vndstrategy+0x2ca
> > > Apr 22 03:19:08 ot29 /bsd: #6 physio+0x204
> > > Apr 22 03:19:08 ot29 /bsd: #7 spec_write+0x9e
> > > Apr 22 03:19:08 ot29 /bsd: #8 VOP_WRITE+0x45
> > > Apr 22 03:19:08 ot29 /bsd: #9 vn_write+0x100
> > > Apr 22 03:19:08 ot29 /bsd: #10 dofilewritev+0x14e
> > > Apr 22 03:19:08 ot29 /bsd: #11 sys_pwrite+0x60
> > > Apr 22 03:19:08 ot29 /bsd: #12 syscall+0x588
> > > Apr 22 03:19:08 ot29 /bsd: #13 Xsyscall+0x128
> >
> > You're not talking about this one isn't it?
>
> This also seems to be in the correct order. vmmaplk before FS lock.
> That's the order of physio(9) and uvm_fault().
>
> > > Apr 22 03:19:08 ot29 /bsd: witness: lock order reversal:
> > > Apr 22 03:19:08 ot29 /bsd: 1st 0xfffffd85c8ae12a8 vmmaplk (&map->lock)
> > > Apr 22 03:19:08 ot29 /bsd: 2nd 0xffff80002ec41860 sbufrcv (&so->so_rcv.sb_lock)
> > > Apr 22 03:19:08 ot29 /bsd: lock order "&so->so_rcv.sb_lock"(rwlock) -> "&map->lock"(rwlock) first seen at:
> > > Apr 22 03:19:08 ot29 /bsd: #0 rw_enter_read+0x50
> > > Apr 22 03:19:08 ot29 /bsd: #1 uvmfault_lookup+0x8a
> > > Apr 22 03:19:08 ot29 /bsd: #2 uvm_fault_check+0x36
> > > Apr 22 03:19:08 ot29 /bsd: #3 uvm_fault+0xfb
> > > Apr 22 03:19:08 ot29 /bsd: #4 kpageflttrap+0x158
> > > Apr 22 03:19:08 ot29 /bsd: #5 kerntrap+0x94
> > > Apr 22 03:19:08 ot29 /bsd: #6 alltraps_kern_meltdown+0x7b
> > > Apr 22 03:19:08 ot29 /bsd: #7 copyout+0x57
> > > Apr 22 03:19:08 ot29 /bsd: #8 soreceive+0x99a
> > > Apr 22 03:19:08 ot29 /bsd: #9 recvit+0x1fd
> > > Apr 22 03:19:08 ot29 /bsd: #10 sys_recvfrom+0xa4
> > > Apr 22 03:19:08 ot29 /bsd: #11 syscall+0x588
> > > Apr 22 03:19:08 ot29 /bsd: #12 Xsyscall+0x128
> > > Apr 22 03:19:08 ot29 /bsd: lock order data w1 -> w2 missing
> >
> > Unfortunately we don't see the backtrace for the reverse lock order.
> > So it is hard to say something sensible. Without more information I'd
> > say that taking "&so->so_rcv.sb_lock" before "&map->lock" is the
> > correct lock order.
>
> I agree. Now I'd be very grateful if someone could dig into WITNESS to
> figure out why we see such reports. Are these false positive or are we
> missing data from the code path that we think are incorrect?
>

Do we need witness(4) support for sblock()?
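
To make the second report above easier to reason about, here is a toy model of lock-order tracking. It is only a sketch and not how witness(4) is implemented: the names OrderChecker, acquire, release, seen, held and reports are invented for illustration, and the reverse path at the end is hypothetical, since the thread has no backtrace for it. The model records each "A held while taking B" pair the first time it is seen and flags a later acquisition that goes the other way.

```python
class OrderChecker:
    """Toy lock-order tracker. Illustration only, not witness(4)."""

    def __init__(self):
        self.seen = set()    # (earlier, later) lock-class pairs observed so far
        self.held = []       # lock classes currently held, in acquisition order
        self.reports = []    # reversals detected, as (1st, 2nd) pairs

    def acquire(self, name):
        for earlier in self.held:
            # A reversal: the opposite order was recorded first and this
            # order has never been seen before.
            if (name, earlier) in self.seen and (earlier, name) not in self.seen:
                self.reports.append((earlier, name))
            self.seen.add((earlier, name))
        self.held.append(name)

    def release(self, name):
        self.held.remove(name)


checker = OrderChecker()

# Order shown in the soreceive() trace: sb_lock is held, then the
# copyout() fault takes the map lock.
checker.acquire("&so->so_rcv.sb_lock")
checker.acquire("&map->lock")
checker.release("&map->lock")
checker.release("&so->so_rcv.sb_lock")

# Hypothetical reverse path (assumption: not identified in the thread).
checker.acquire("&map->lock")
checker.acquire("&so->so_rcv.sb_lock")

print(checker.reports)
```

In this model the first-seen order comes with its call site, while the reverse order is only the pair being flagged, which is roughly the situation in the report: the trace for sb_lock -> map lock is printed, and the data for the other direction is missing.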