[cc's trimmed]
John,
Thanks for the suggestion, I appreciate it. I did as you suggested
(diff below).
It panicked again, but this time savecore said "dump time is unreasonable."
The short panic message was:
panicstr: bremfree: bp 0xcc2a1ae4 not locked
Looks like the same thing to me, sorry. Any other suggestions?
magpire/sys/kern;diff subr_witness.c subr_witness.c-dist
392a393
> mtx_lock(&all_mtx);
395d395
< mtx_lock(&all_mtx);
magpire/sys/kern;diff -c subr_witness.c subr_witness.c-dist
*** subr_witness.c Thu Aug 16 16:16:06 2001
--- subr_witness.c-dist Thu Aug 16 16:15:20 2001
***************
*** 390,398 ****
mtx_unlock_spin(&w_mtx);
}
lock_cur_cnt--;
STAILQ_REMOVE(&all_locks, lock, lock_object, lo_list);
- mtx_lock(&all_mtx);
lock->lo_flags &= ~LO_INITIALIZED;
mtx_unlock(&all_mtx);
}
--- 390,398 ----
mtx_unlock_spin(&w_mtx);
}
+ mtx_lock(&all_mtx);
lock_cur_cnt--;
STAILQ_REMOVE(&all_locks, lock, lock_object, lo_list);
lock->lo_flags &= ~LO_INITIALIZED;
mtx_unlock(&all_mtx);
}
magpire/sys/kern;
On Wed, Aug 15, 2001 at 04:42:21PM -0700, John Baldwin wrote:
>
> On 15-Aug-01 Michael Lucas wrote:
> > On Wed, Aug 15, 2001 at 10:21:39AM +0930, Greg Lehey wrote:
> >> To help localize this problem, could you please try this same thing on
> >> a kernel without devfs? The dump you sent me did not look like a
> >> Vinum bug, as I said in my reply.
> >
> > Sorry, it happens on a non-devfs kernel as well. Since it doesn't
> > appear to be a Vinum bug, I'm taking the liberty of sending the whole
> > thing to -current. (I sent my first dump to Greg in particular, since
> > a Vinum command triggered whatever this is.)
> >
>
> > Script started on Wed Aug 15 17:57:48 2001
> > magpire/var/crash;file /boot/kernel/vinum.ko
> > /boot/kernel/vinum.ko: ELF 32-bit LSB shared object, Intel 80386, version 1
> > (FreeBSD), not stripped
> > magpire/var/crash;file kernel.debug.nodevfs
> > kernel.debug.nodevfs: ELF 32-bit LSB executable, Intel 80386, version 1
> > (FreeBSD), dynamically linked (uses shared libs), not stripped
> > magpire/var/crash;gdb -k kernel.debug.nodevfs vmcore.3
> > GNU gdb 4.18
> > Copyright 1998 Free Software Foundation, Inc.
> > GDB is free software, covered by the GNU General Public License, and you are
> > welcome to change it and/or distribute copies of it under certain conditions.
> > Type "show copying" to see the conditions.
> > There is absolutely no warranty for GDB. Type "show warranty" for details.
> > This GDB was configured as "i386-unknown-freebsd"...
> > IdlePTD 4284416
> > initial pcb at 34b860
> > panicstr: bremfree: bp 0xcc2a1ae4 not locked
>
> Unfortunately this is the panic message from later on during the syncing disks
> stage, not the real panic. :(
>
> > #15 0xc01f0783 in witness_destroy (lock=0xc1ec4e68) at
> > ../../../kern/subr_witness.c:395
>
> This is the real problem:
>
> mtx_lock(&all_mtx);
> lock_cur_cnt--;
> STAILQ_REMOVE(&all_locks, lock, lock_object, lo_list);
> lock->lo_flags &= ~LO_INITIALIZED;
> mtx_unlock(&all_mtx);
>
> It panics in the STAILQ_REMOVE(). I've seen this a couple of times but have no
> idea how that list pointer is getting corrupted. My guess is that a mutex is
> being destroyed twice or something dumb like that; however, I'm not sure how.
> The LO_INITIALIZED flags and checks are supposed to catch that case. I suppose
> there is a chance we could preempt in between the LO_INITIALIZED check and the
> actual removal and then free it and get in trouble that way. Hmm. Try moving
> the mtx_lock of &all_mtx before the check for LO_INITIALIZED and see if you can
> get a different panic. It may be a bug in the ucred stuff. (At least several
> other panics of this type have been the result of crfree's.)
>
> --
>
> John Baldwin <[EMAIL PROTECTED]> -- http://www.FreeBSD.org/~jhb/
> PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
> "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
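
For anyone following along, here is a minimal user-space sketch of the
locking order John describes above: take all_mtx first, then test
LO_INITIALIZED, then unlink, so nothing can slip in between the check and
the STAILQ_REMOVE(). It is an illustration under assumptions, not the
code in subr_witness.c: pthread_mutex_t stands in for the kernel mutex,
the sketch_init/sketch_destroy names are made up, struct lock_object is
cut down to two fields, and returning false on a second destroy is my
choice for the example (it says nothing about what witness_destroy
actually does in that case). It also assumes a libc that ships the BSD
<sys/queue.h> macros.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <sys/queue.h>

#define LO_INITIALIZED 0x1u

/* Hypothetical, cut-down stand-in for the kernel's struct lock_object. */
struct lock_object {
	unsigned lo_flags;
	STAILQ_ENTRY(lock_object) lo_list;
};

static STAILQ_HEAD(, lock_object) all_locks =
    STAILQ_HEAD_INITIALIZER(all_locks);
static pthread_mutex_t all_mtx = PTHREAD_MUTEX_INITIALIZER;
static int lock_cur_cnt;

/* Put a lock on the global list and mark it initialized. */
void
sketch_init(struct lock_object *lock)
{
	pthread_mutex_lock(&all_mtx);
	/* Registering the same lock twice would corrupt the list. */
	assert((lock->lo_flags & LO_INITIALIZED) == 0);
	STAILQ_INSERT_TAIL(&all_locks, lock, lo_list);
	lock->lo_flags |= LO_INITIALIZED;
	lock_cur_cnt++;
	pthread_mutex_unlock(&all_mtx);
}

/*
 * Take a lock off the global list. The flag test and the unlink both
 * happen with all_mtx held, so a second caller sees the cleared flag
 * instead of running STAILQ_REMOVE() on an element that is already gone.
 * Returns true if this call removed the lock, false otherwise.
 */
bool
sketch_destroy(struct lock_object *lock)
{
	bool removed = false;

	pthread_mutex_lock(&all_mtx);
	if (lock->lo_flags & LO_INITIALIZED) {
		lock_cur_cnt--;
		STAILQ_REMOVE(&all_locks, lock, lock_object, lo_list);
		lock->lo_flags &= ~LO_INITIALIZED;
		removed = true;
	}
	pthread_mutex_unlock(&all_mtx);
	return (removed);
}
```

With this ordering a double destroy is refused rather than walking the
list looking for an element that is no longer on it. It does not help if
the lock's memory has already been freed and reused, which is the other
possibility John raises.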
--
Michael Lucas
[EMAIL PROTECTED]
http://www.blackhelicopters.org/~mwlucas/
Big Scary Daemons: http://www.oreillynet.com/pub/q/Big_Scary_Daemons