On Tue, Jul 10, 2001 at 11:43:29PM -0700, Matt Dillon wrote:
>
> :Hi,
> :
> :I've just tripped over an obviously long-standing (since about
> :Jan. 1998) bug in vinvalbuf while looking into PR kern/26224. The
> :problematic code looks like (on -CURRENT):
> :
> : /*
> : * Destroy the copy in the VM cache, too.
> : */
> : mtx_lock(&vp->v_interlock);
> : if (VOP_GETVOBJECT(vp, &object) == 0) {
> : vm_object_page_remove(object, 0, 0,
> : (flags & V_SAVE) ? TRUE : FALSE);
> : }
> : mtx_unlock(&vp->v_interlock);
> :
> :The locks seem to be needed for file systems that don't perform real
> :locking (on -STABLE, they are simplelocks).
> :This, however, is incorrect because vm_object_page_remove may sleep.
> :I've attached a patch that passes the interlock to
> :vm_object_page_remove, which in turn passes it to a modified version
> :of vm_page_sleep, which unlocks it around the sleep.
> :I think that this is correct, because the object should be in a valid
> :state when we sleep (and all checks are reexecuted in that case).
> :
> :Since I'm not very experienced with vfs and vm stuff, I'd be glad if
> :this patch could get some review. In particular, is the lock/unlock
> :pair really still needed, and are there notable differeces in -STABLE
> :(because the fix would need the be MFC'ed)?
> :
> :Thanks,
> : - thomas
>
> Ok, I've looked at this more. What is supposed to happen is that
> the 'while (vp->v_numoutput) ...' code just above the section you
> cited is supposed to prevent the deadlock. The while code looks like
> this:
>
> while (vp->v_numoutput > 0) {
> vp->v_flag |= VBWAIT;
> tsleep(&vp->v_numoutput, PVM, "vnvlbv", 0);
> }
>
> However, as Ian points out in his followup, it doesn't work:
>
> :...
> :I've seen a related deadlock here in 4.x with NFS; vm_fault locks
> :a page and then calls vput which aquires the v_interlock. This code
> :in vinvalbuf locks the v_interlock first, and then vm_object_page_remove()
> :locks the page. That's a simple lock-order reversal deadlock which I
> :guess would still exist even with this patch.
>
> It doesn't work for the simple reason that vm_fault isn't busying the
> page for writing, it's busying it for reading, so v_numoutput will be
> 0.
>
> I think the best solution is to change vinvalbuf's wait loop to
> wait on vm_object->paging_in_progress instead of vp->v_numoutput,
> or perhaps to wait for both conditions to be satisfied.
>
> vm_object->paging_in_progress applies to reads and writes, while
> vp->v_numoutput only applies to writes.
>
> I know this isn't the most correct solution... there is still the
> lock reversal if vm_object_remove_pages() were ever to sleep. The
> idea is that it won't sleep if there is no paging in progress because
> nobody will have any of the object's pages locked. I think it is
> the best we can do for the moment. There are several places in
> vm/vm_object.c where the same assumption is made (testing against
> vm_object->paging_in_progress, though, which works, not vp->v_numoutput).
>
> -
>
> Thomas, if you are interested this could be a little project for you.
> See if you can code-up another while() loop to also wait for
> the object's paging_in_progress count to become 0 in the vinvalbuf()
> code. Look in vm/vm_object.c for examples of how to construct such
> a loop. I will be happy to review and test whatever you come up with
> and I'm sure Ian will too!
>
> -Matt
>
>
I recently mentioned on freebsd-stable in message
<[EMAIL PROTECTED]>
a recurring rslock panic which I believe has been caused by the below
mentioned problem in vinvalbuf(). I have worked up a patch for -STABLE
(which should also apply to -CURRENT if there have not been major changes
to this code). The patch is appended to this message for review.
Data from the crash dump revealed the following:
SMP 2 cpus
IdlePTD 3555328
initial pcb at 2cf280
panicstr: rslock: cpu: 0, addr: 0xd7be66ec, lock: 0x00000001
panic messages:
---
panic: rslock: cpu: 0, addr: 0xd7be66ec, lock: 0x00000001
mp_lock = 00000001; cpuid = 0; lapic.id = 01000000
boot() called on cpu#0
---
#0 dumpsys () at /usr/src/sys/kern/kern_shutdown.c:473
#1 0xc016cf8f in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:313
#2 0xc016d3a9 in panic (fmt=0xc025bcce "rslock: cpu: %d, addr: 0x%08x, lock: 0x%08x")
at /usr/src/sys/kern/kern_shutdown.c:581
#3 0xc025bcce in bsl1 ()
#4 0xc021eeac in _unlock_things (fs=0xd7a6dec4, dealloc=0)
at /usr/src/sys/vm/vm_fault.c:147
#5 0xc021f8c7 in vm_fault (map=0xd7a6bf40, vaddr=673968128, fault_type=1 '\001',
fault_flags=0) at /usr/src/sys/vm/vm_fault.c:826
#6 0xc025d016 in trap_pfault (frame=0xd7a6dfa8, usermode=1, eva=673972223)
at /usr/src/sys/i386/i386/trap.c:829
#7 0xc025ca8b in trap (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 99049,
tf_esi = 0, tf_ebp = -1077937884, tf_isp = -676929580, tf_ebx = 48110729,
tf_edx = 0, tf_ecx = 1835007, tf_eax = 672137216, tf_trapno = 12, tf_err = 4,
tf_eip = 134517190, tf_cs = 31, tf_eflags = 66070, tf_esp = -1077937940,
tf_ss = 47})
at /usr/src/sys/i386/i386/trap.c:359
#8 0x80491c6 in ?? ()
#9 0x8048d9e in ?? ()
#10 0x804a078 in ?? ()
#11 0x8048b45 in ?? ()
---
A quick review of processes revealed a process stuck in vmopar:
(kgdb) ps
...
46479 d7ffc560 d806e000 235886 1 46394 004006 3 tail vmopar c09120c8
...
which was sleeping in vm_object_page_remove() in vinvalbuf():
(kgdb) btp 46479
frame 0 at 0xd806fc8c: ebp d806fcb8, eip 0xc0170051 <tsleep+429>: mov
0x141(%ebx),%al
frame 1 at 0xd806fcb8: ebp d806fce0, eip 0xc02262e8 <vm_object_page_remove+268>:
add $0x10,%esp
frame 2 at 0xd806fce0: ebp d806fd2c, eip 0xc019a667 <vinvalbuf+803>: add
$0x10,%esp
frame 3 at 0xd806fd2c: ebp d806fd60, eip 0xc01d0aa0 <nfs_vinvalbuf+264>: add
$0x18,%esp
frame 4 at 0xd806fd60: ebp d806fe2c, eip 0xc01cf5d8 <nfs_bioread+540>: mov
%eax,0xffffff74(%ebp)
frame 5 at 0xd806fe2c: ebp d806fe44, eip 0xc01f6842 <nfs_read+30>: jmp
0xc01f6849 <nfs_read+37>
frame 6 at 0xd806fe44: ebp d806fe78, eip 0xc01a22cc <vn_read+280>: mov
%eax,0xffffffe8(%ebp)
frame 7 at 0xd806fe78: ebp d806fef4, eip 0xc017b690 <dofileread+176>: mov
%eax,%esi
frame 8 at 0xd806fef4: ebp d806ff28, eip 0xc017b556 <read+54>: mov %eax,%esi
frame 9 at 0xd806ff28: ebp d806ffa0, eip 0xc025d745 <syscall2+537>: mov
%eax,0xffffffb8(%ebp)
The patch is below, against vfs_subr.c 1.249.2.11 2001/09/11
--- vfs_subr.c Tue Sep 11 04:49:53 2001
+++ vfs_subr.c.new Wed Sep 26 15:23:09 2001
@@ -721,9 +721,9 @@
}
}
- while (vp->v_numoutput > 0) {
- vp->v_flag |= VBWAIT;
- tsleep(&vp->v_numoutput, PVM, "vnvlbv", 0);
+ if (VOP_GETVOBJECT(vp, &object) == 0) {
+ while (object->paging_in_progress)
+ vm_object_pip_sleep(object, "vnvlbv");
}
splx(s);
To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message