On Mon, Nov 18, 2024 at 06:26:30PM -0800, Andrew Hewus Fresh wrote:
> On Tue, Nov 19, 2024 at 11:59:50AM +1000, David Gwynne wrote:
> > On Mon, Nov 18, 2024 at 03:26:21PM -0800, Andrew Hewus Fresh wrote:
> > > On Sun, Nov 17, 2024 at 04:16:17PM +1000, David Gwynne wrote:
> > > > On Sat, Nov 16, 2024 at 07:36:37PM -0800, Andrew Hewus Fresh wrote:
> > > > > I finally got around to fixing my alpha, which involved replacing the
> > > > > > disk.  That meant I have to scp some stuff over to it and after a bit
> > > > > > of time it panics:
> > > > > 
> > > > > panic: mtx 0xfffffe000002a628: locking against myself
> > > > > Stopped at      db_enter+0x8:   lda     sp,10(sp)
> > > > >     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> > > > > *204699  79774      0     0x14000      0x200    0  softnet0
> > > > > db_enter(0, 7ffffe00e0003f8, 1, 8, 3, 8) at db_enter+0x8
> > > > > panic(?, fffffe000002a628, 1b0, 10, a, 1) at panic+0xe8
> > > > > mtx_enter(?, ?, 1b0, 10, a, 1) at mtx_enter+0xb4
> > > > > ifq_set_oactive(?, ?, 1b0, 10, a, 1) at ifq_set_oactive+0x50
> > > > 
> > > > this is from src/sys/net/ifq.c r1.50 where i added a counter for the
> > > > number of times oactive gets set. because there's checks and multiple
> > > > things being tweaked i used the ifq mutex to serialise the updates.
> > > > 
> > > > de(4) uses ifq_deq_begin to try and shove an mbuf onto the hardware,
> > > > which takes but doesn't release the ifq mutex until ifq_deq_commit or
> > > > ifq_deq_rollback is called. so while it's holding the mutex it calls
> > > > ifq_set_oactive, which also tries to take the mutex.
> > > > 
> > > > i honestly don't understand what de(4) is doing with the hardware and
> > > > packet setup, so i don't feel confident changing the driver to avoid
> > > > this. the least worst alternative i could think of is to provide an
> > > > alternative set_oactive it can call.
> > > > 
> > > > the diff below should fix this.
> > > 
> > > I mean, it "fixed" the locking against myself error :-)
> > 
> > nice work getting a kernel build going.
> 
> Thanks, handily I have serial console and the install kernel lets me
> `ftp` things onto the disk.
> 
> 
> > > de0 at pci0 dev 11 function 0 "DEC 21040" rev 0x23, DEC 21040 pass 2.3: isa irq 5, address ...
> > > panic: mutex 0xfffffe000002a628 not held in ifq_deq_set_oactive
> > > Stopped at      db_enter+0x8:   lda     sp,10(sp)
> > >     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> > > *     0      0      0     0x10000      0x200    0  swapper
> > > db_enter(0, 7ffffe00e0003f8, 1, 8, 3, fffffc0000000008) at db_enter+0x8
> > > panic(?, fffffe000002a628, fffffc0000bc9718, 1, 5, 1) at panic+0xe8
> > > ifq_deq_set_oactive(?, ?, fffffc0000bc9718, 1, 5, 1) at ifq_deq_set_oactive+0x88
> > > tulip_txput(?, ?, fffffe000002a000, ?, 5, 1) at tulip_txput+0x5e8
> > 
> > hrm.
> > 
> > this might be better. or it might not even compile.
> 
> We'll find out eventually.  Thank you!

Good news!  The second patch seems to be working much better.  I got it
compiled, it booted up, I copied a bunch of files over ssh and no panics
so far!
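
For anyone reading this thread later, the two panics above can be modelled
in userland C. This is only a toy sketch and not either of the diffs from
this thread: every toy_* name is made up for illustration, the "mutex" just
records an owner and never blocks, and the panics are turned into return
codes so the behaviour can be checked. It assumes the counter is bumped
only when oactive goes from clear to set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a non-recursive mutex: it only remembers its owner. */
struct toy_mtx {
	const void *owner;	/* NULL when unlocked */
};

/* Toy stand-in for the transmit queue; field names are illustrative. */
struct toy_ifq {
	struct toy_mtx mtx;
	bool oactive;
	unsigned int oactives;	/* times oactive went from clear to set */
};

enum toy_result {
	TOY_OK,
	TOY_LOCKING_AGAINST_MYSELF,	/* models the first panic */
	TOY_MUTEX_NOT_HELD		/* models the second panic */
};

static enum toy_result
toy_mtx_enter(struct toy_mtx *m, const void *self)
{
	if (m->owner == self)
		return TOY_LOCKING_AGAINST_MYSELF;
	m->owner = self;	/* single-threaded toy: no contention handling */
	return TOY_OK;
}

static void
toy_mtx_leave(struct toy_mtx *m)
{
	assert(m->owner != NULL);
	m->owner = NULL;
}

/* Shared update; the caller must already hold the mutex. */
static void
toy_oactive_locked(struct toy_ifq *q)
{
	if (!q->oactive) {
		q->oactive = true;
		q->oactives++;
	}
}

/* Takes the mutex itself, so it fails if the caller already holds it. */
static enum toy_result
toy_set_oactive(struct toy_ifq *q, const void *self)
{
	enum toy_result r = toy_mtx_enter(&q->mtx, self);

	if (r != TOY_OK)
		return r;
	toy_oactive_locked(q);
	toy_mtx_leave(&q->mtx);
	return TOY_OK;
}

/* Variant for callers that are between deq_begin and commit/rollback. */
static enum toy_result
toy_deq_set_oactive(struct toy_ifq *q, const void *self)
{
	if (q->mtx.owner != self)
		return TOY_MUTEX_NOT_HELD;
	toy_oactive_locked(q);
	return TOY_OK;
}
```

Calling toy_set_oactive while holding the mutex gives the "locking against
myself" case; calling toy_deq_set_oactive without holding it gives the
"not held" case. A driver that sets oactive from both kinds of path needs
to pick the variant that matches whether it holds the mutex at that point.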
