Lars Eggert wrote:
> Terry Lambert wrote:
> > Debug:
> >
> [excellent kernel-debugging recipe snipped]
> 
> Here's a backtrace of a crashdump that should be more helpful:

[ ... ]

> (kgdb) up 12
> #12 0xc02098a4 in namei (ndp=0x9e) at /usr/src/sys/kern/vfs_lookup.c:158
> 158             FILEDESC_LOCK(fdp);
> (kgdb) list
> 153     #endif
> 154
> 155             /*
> 156              * Get starting point for the translation.
> 157              */
> 158             FILEDESC_LOCK(fdp);
> 159             ndp->ni_rootdir = fdp->fd_rdir;
> 160             ndp->ni_topdir = fdp->fd_jdir;
> 161
> 162             dp = fdp->fd_cdir;
> 
> (kgdb) print ndp
> $2 = (struct nameidata *) 0x9e
> 
> (kgdb) print fdp
> $1 = (struct filedesc *) 0x34
> (kgdb)
> 
> (kgdb) print p
> $3 = (struct proc *) 0x0
> 
> (kgdb) print td
> $5 = (struct thread *) 0xc662d1e0
> 
> (kgdb) print *td
> $7 = {td_proc = 0xc66307f0,
> [...]
> 
> Very strange. namei() does essentially the following:
> 
>         p = td->td_proc;
>         fdp = p->p_fd;
> 
> td->td_proc seems reasonable, but p is 0. No idea how this could happen,
> any guesses?

Cool.

This is not at all where I guessed the problem was.  8-) 8-).

There's a commit that Alfred made last Friday night that might
have something to do with it.  It was an attempt to fix a lock
order reversal between "PROC/filedesc", according to the commit,
and it introduced "fdesc_mtx".

If you grep for that everywhere, and then annotate the involved
files, it should be pretty obvious which changes to revert to see
if this is the case (1.50->1.49 of /sys/sys/filedesc.h, etc.).
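
A rough sketch of the grep step, in Python purely for illustration.
The helper name files_mentioning() is made up, and restricting the
search to .c and .h files is an assumption; it only lists the files
that mention the identifier, so the annotate pass (e.g. cvs annotate)
still has to be done by hand on each hit.

```python
import os
from typing import List


def files_mentioning(root: str, needle: str) -> List[str]:
    """Return the sorted paths of .c/.h files under root that contain needle."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            # Assumption: only C sources and headers are interesting.
            if not name.endswith((".c", ".h")):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", errors="replace") as fh:
                    if any(needle in line for line in fh):
                        hits.append(path)
            except OSError:
                # Unreadable file: skip it rather than abort the scan.
                continue
    return sorted(hits)
```

Something like files_mentioning("/usr/src/sys", "fdesc_mtx") would
then give the list of files to annotate.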

It may also be an issue with some of the recent KSE commits
over the last weekend missing an assignment on a context switch.

Probably the easiest thing to do, if you can repeat the problem
reliably, is to binary search (bsearch), starting 8 days ago, for
the commit that broke the camel's back.

It's really tempting to make a script that's capable of carrying
out a /usr/src/sys bsearch semi-automatically, because people are
really hesitant to use this approach for solving problems, even
though it only requires O(log2(N)) reboots to find the offending
commit...
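
For what it's worth, the search itself is tiny. Here is a minimal
sketch in Python, under some loud assumptions: first_bad() and the
crashes() callback are hypothetical names, the oldest snapshot is
known good, the newest is known bad, and the tree stays broken once
it breaks. In real life crashes() would mean "check out /usr/src/sys
as of that snapshot, build, reboot, and try to reproduce the panic",
which is the part that is hard to automate.

```python
from typing import Callable, Sequence, Tuple


def first_bad(snapshots: Sequence[str],
              crashes: Callable[[str], bool]) -> Tuple[str, int]:
    """Return (first crashing snapshot, number of kernels tested).

    Assumes snapshots[0] is known good, snapshots[-1] is known bad,
    and every snapshot after the breaking commit also crashes.
    """
    good, bad = 0, len(snapshots) - 1
    tested = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tested += 1  # one build-and-reboot cycle per probe
        if crashes(snapshots[mid]):
            bad = mid   # breakage is at mid or earlier
        else:
            good = mid  # breakage is after mid
    return snapshots[bad], tested
```

With one snapshot per day over the last 8 days (9 points counting
today), that is 3 probes before you know which day's commits to
look at.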


-- Terry
