John Baldwin wrote:

> On 22-Feb-01 Maxim Sobolev wrote:
> > John Baldwin wrote:
> >
> >> On 22-Feb-01 Maxim Sobolev wrote:
> >>
> >> >> > Here it is (from DDB):
> >> >> > panic(c027de93,c0297409,c027f878,368,80286)
> >> >> > _mtx_assert(c02ea000,9,c027f878,368,80286)
> >> >> > mi_switch(c32c5da0,3,c02cea44,c357be98)
> >> >> > ithread_schedule(c0747c00,1)
> >> >> > sched_ithd(e)
> >> >> > Xresume14()
> >> >> > --- interrupt, eip = 0xc025b60f, esp = 0x80296, ebp = 0xc357bf08 ---
> >> >> > trap(18, 10, 10,c01597b6,20)
> >> >> > calltrap()
> >> >> > --- trap 0x9, eip = 0xc025a5de, esp = 0xc357bf50, ebp = 0xc357bf64 ---
> >> >> > sw1b(c0146cbc,c0146cbc,c32c5da0,c357bf94)
> >> >> > ithread_loop(c0747c00,c357bfa8)
> >> >> > fork_exit(c0146cbc,c0747c00,c357bfa8)
> >> >> > fork_trampoline()
> >> >>
> >> >> *sigh* This is why enabling interrupts in trap() is such a bad
> >> >> idea. If we get a trap in the scheduler, then lots of bad crap
> >> >> starts to happen because we can get an interrupt while we are in
> >> >> a trap. :( Can you compile your kernel with INVARIANTS on though,
> >> >> as I think the kernel should've panic'd earlier if it is doing
> >> >> what I think it is doing.
> >> >
> >> > It already has INVARIANTS, MUTEX_DEBUG, WITNESS and WITNESS_DDB.
> >>
> >> Hmm, ouch, you don't want MUTEX_DEBUG, that'll slow your system to
> >> a crawl.
> >
> > It doesn't really matter, because the system can't even boot into
> > single-user due to the panic.
> >
> >> >> Also, if you are feeling industrious, edit
> >> >> sys/i386/i386/trap.c and comment out the enable_intr() call near
> >> >> the beginning of the trap() function right after the printf for
> >> >> 'kernel trap %d with interrupts disabled'.
> >> >
> >> > Ok, I'll try that.
> >> >
> >> > -Maxim
> >>
> >> It will still panic, just hopefully a better panic.
> >
> > I did understand that, but the panic I see after the change is
> > exactly the same as before. Any other ideas?
>
> A recursive sched_lock? Erm, well, stick these options in your kernel
> config:
>
> options KTR
> options KTR_EXTEND
> options KTR_COMPILE=KTR_LOCK
> options KTR_MASK=KTR_MASK
>
> Then when it panics, use the 'show ktr' command to list the mutex
> operations up until that point. Hopefully you can see where it is
> grabbing sched lock the first time and then not releasing it.

Ok, I'll do it and send the results later.

> Also, has the backtrace changed at all?
> If not, then you may have commented out the wrong enable_intr(). :)

I did what you suggested. Please see the attached diff.

-Maxim
--- src/sys/i386/i386/trap.c	2001/02/22 16:20:12	1.1
+++ src/sys/i386/i386/trap.c	2001/02/22 16:20:58
@@ -264,7 +264,7 @@
 			 * We should walk p_heldmtx here and see if any are
 			 * spin mutexes, and not do this if so.
 			 */
-			enable_intr();
+/*			enable_intr();*/
 		}
 	}
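
For readers following the "recursive sched_lock" guess above, here is a rough user-space sketch of that failure mode: a lock is taken a second time by the same holder before it is released, and a later "owned and not recursed" check fails. Everything in it (the sk_ names, the flag values, the single-threaded model) is hypothetical and made up for illustration. It is not the kernel's mutex code and makes no claim about what mi_switch() actually checks.

```c
/*
 * Hypothetical sketch only: a single-threaded model of a lock that
 * tracks its owner and a recursion count. None of these names or
 * values come from the FreeBSD sources.
 */
#define SK_OWNED       0x1  /* assumed flag: caller must hold the lock */
#define SK_NOTRECURSED 0x2  /* assumed flag: lock must be held exactly once */

struct sk_mtx {
    int owner;    /* 0 when free, otherwise the holder's id */
    int recurse;  /* nested acquisitions by the current owner */
};

/* Acquire: a repeat acquisition by the same holder only bumps the count. */
void sk_lock(struct sk_mtx *m, int who)
{
    if (m->owner == who)
        m->recurse++;
    else
        m->owner = who;  /* no contention handling in this model */
}

/* Release: unwind one nesting level, or free the lock at the outermost one. */
void sk_unlock(struct sk_mtx *m)
{
    if (m->recurse > 0)
        m->recurse--;
    else
        m->owner = 0;
}

/* Return 1 if the assertion holds, 0 where a checking kernel would panic. */
int sk_assert(const struct sk_mtx *m, int who, int what)
{
    if ((what & SK_OWNED) && m->owner != who)
        return 0;
    if ((what & SK_NOTRECURSED) && m->recurse != 0)
        return 0;
    return 1;
}

/*
 * Model the suspected sequence: the lock is taken, a nested path takes
 * it again before it is released, then the switch-time check runs.
 * Returns the result of that check.
 */
int sk_nested_switch_check(void)
{
    struct sk_mtx sched = { 0, 0 };
    const int me = 1;

    sk_lock(&sched, me);  /* first acquisition */
    sk_lock(&sched, me);  /* second acquisition from the nested path */
    return sk_assert(&sched, me, SK_OWNED | SK_NOTRECURSED);
}
```

In this model sk_nested_switch_check() returns 0, because the recursion count is still nonzero when the check runs. A trace of lock operations, such as the one 'show ktr' is meant to give, is what would show where the first acquisition happened.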