(Damien, I'm putting you in Cc because there's a diff below I'd like your opinion on.)

On Fri, Feb 27, 2009 at 07:29:45AM +0000, Stefan Sperling wrote:
> On Thu, Feb 26, 2009 at 05:46:33PM +0000, Stefan Sperling wrote:
> > On Sat, Feb 21, 2009 at 10:54:57AM +0000, Stefan Sperling wrote:
> > > I got more panics, both while using the laptop (in X, so no trace),
> > > and today I got another one at boot time (totally different than
> > > any other one I had before, and also not reproducible, I couldn't
> > > take pics of the trace but it was in the uvm layer, during an execve
> > > system call).
> >
> > So I've compiled a kernel with debugging symbols and made
> > it core dump upon panics.
> >
> > I've seen several panics now that are all very similar, always
> > crashing while locking vnodes, but not exclusively during execve,
> > I've also seen one during a write(2) to a socket, for example.
> >
> > The panics happen both with GENERIC and GENERIC.MP.
> > I don't know how to trigger them but they tend to happen
> > about 2 or 3 times per day.
> >
> > So far they only happened when I had the cardbus card inserted:
> > ral0 at cardbus0 dev 0 function 0 "Ralink RT2560" rev 0x01: irq 268505099,
> > address 00:0e:2e:5c:55:4f
> > ral0: MAC/BBP RT2560 (rev 0x04), RF RT2525
> >
> > Below is a trace from one of the crashes.
> > Any hints are appreciated.
>
> Here's another crash that hopefully shows more relation to the
> cardbus card, so I changed the subject again...
>
> I could not get a dump for this because the disk was not responding,
> (I guess that was because interrupts were masked?) so this is copied
> from a piece of paper, and the trace has some omissions (still no
> serial on this thing).
>
> uvm_fault(0xd080eb20, 0x1200f00, 0, 1) -> e
>
> fatal page fault (6) in supervisor mode
> trap type 6 code 0 eip d03cdda2 cs 50 eflags 10297 cr2 1200ffff cpl 60
> panic: trap type 6, code=0, pc=d03cdda2
>
> --- trap (number 6) ---
> ieee80211_tree_RB_MINMAX(d1d64030, 1200ffff, dc251d40, 0) at
> ieee80211_tree_RB_MINMAX+0x5a
> ieee80211_find_rxnode(d1d64030, 1200ffff, 0, 800, 0) at
> ieee80211_find_rxnode+0x1a
> rt2560_decryption_intr(d1d64000, dc278000, 20, ffffffff) at
> rt2560_decryption_intr+0x197
> rt2560_intr(d1d64000, 4, 0, 1) at rt2560_intr+0x71
> pccbintr_function(d1bdf600) at pccbintr_function+0x71
> Xintr_ioapic2() at Xintr_ioapic2+0x74
>
> The network the card was connected to is using WEP.
>
> For dmesg etc. see previous mails in this thread.

I am now testing this diff:

Index: if_ral_cardbus.c
===================================================================
RCS file: /usr/cvs/src/sys/dev/cardbus/if_ral_cardbus.c,v
retrieving revision 1.12
diff -u -p -r1.12 if_ral_cardbus.c
--- if_ral_cardbus.c	25 Nov 2008 22:20:11 -0000	1.12
+++ if_ral_cardbus.c	27 Feb 2009 09:48:44 -0000
@@ -230,7 +230,7 @@ ral_cardbus_enable(struct rt2560_softc *
 	ral_cardbus_setup(csc);
 
 	/* map and establish the interrupt handler */
-	csc->sc_ih = cardbus_intr_establish(cc, cf, csc->sc_intrline, IPL_NET,
+	csc->sc_ih = cardbus_intr_establish(cc, cf, csc->sc_intrline, IPL_VM,
 	    csc->sc_opns->intr, sc, sc->sc_dev.dv_xname);
 	if (csc->sc_ih == NULL) {
 		printf("%s: could not establish interrupt at %d\n",

Some (all?) ieee80211 devices may end up calling malloc in ISR context,
in ieee80211_node_alloc(). If I am reading spl(9) correctly, use of
malloc needs to be protected with splvm(), not splnet().

Let's see how it goes.

Stefan
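
P.S. To make the priority-level reasoning concrete, here is a toy userland
model of spl-style masking. It is not kernel code: the numeric levels and
the toy_* names are made up for illustration, and the only thing taken from
spl(9) (as I read it) is the ordering, with the net level below the vm
level. It only shows the masking rule, not whether the diff above fixes
the panic.

```c
#include <assert.h>

/* Hypothetical numeric levels; only the ordering NET < VM matters. */
enum toy_ipl { TOY_IPL_NONE = 0, TOY_IPL_NET = 5, TOY_IPL_VM = 7 };

static int toy_cpl = TOY_IPL_NONE;	/* current priority level */

/* Raise the current level (never lowers it) and return the old level. */
static int
toy_splraise(int level)
{
	int old = toy_cpl;

	assert(level >= TOY_IPL_NONE);
	if (level > toy_cpl)
		toy_cpl = level;
	return old;
}

/* Restore a level previously returned by toy_splraise(). */
static void
toy_splx(int old)
{
	toy_cpl = old;
}

/* A handler may run only if its level is strictly above the current one. */
static int
toy_deliverable(int intr_level)
{
	return intr_level > toy_cpl;
}
```

In this model, after toy_splraise(TOY_IPL_NET) a handler registered at
TOY_IPL_VM is still deliverable, while after toy_splraise(TOY_IPL_VM)
handlers at both levels are held off until toy_splx() restores the old
level.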