On Sun, Jan 07, 2007 at 11:49:56AM +0000, Robert Watson wrote:
> On Sat, 6 Jan 2007, Ceri Davies wrote:
> 
> >>>So far it's happened this morning and yesterday morning.  I haven't seen 
> >>>it before that.  I don't know the cause so I can't reproduce it at will, 
> >>>but the logs don't give any indication.  Chances are that it will happen 
> >>>again tomorrow, but we'll see.
> >>
> >>Hmm.  It looks like you printf *(td->td_proc->p_fd->fd_ofiles) without 
> >>the array index.  Could you repeat that, but with the array index -- 
> >>i.e., td->td_proc->p_fd->fd_ofiles[uap->fd]?  Also, it would probably be 
> >>useful to print uap->fd.  Right now you're printing stdin (index 0), but 
> >>if the index is non-0, we want a different file.
> >
> >Very tactfully put :)  Sorry about that.
> >
> >None of the uap->fd's seem to be valid. In the first case, uap->fd is way 
> >too high for the length of fd_ofiles, which only has 21 elements:
> >
> >(kgdb) up 8
> >#8  0xc04c470d in fstat (td=0xc2eeb180, uap=0xd610dc74) at 
> >/usr/src/sys/kern/kern_descrip.c:1075
> >1075            error = kern_fstat(td, uap->fd, &ub);
> >(kgdb) p uap->fd
> >$1 = 89
> >(kgdb) p *td->td_proc->p_fd->fd_ofiles[uap->fd]
> >Cannot access memory at address 0x0
> >
> >In the second, uap->fd is nonsense:
> >
> >(kgdb) up 8
> >#8  0xc04c470d in fstat (td=0xc3109300, uap=0xd617ec74) at 
> >/usr/src/sys/kern/kern_descrip.c:1075
> >1075            error = kern_fstat(td, uap->fd, &ub);
> >(kgdb) p uap->fd
> >$1 = -1023449232
> >(kgdb)
> 
> Hmm.  So, I reviewed audit_arg_file() closely, and after staring at the 
> code a lot, couldn't see anything obvious in either the socket or the 
> vnode/fifo case.  I did fix one other bug there, however, which can never 
> actually be exercised in 7-CURRENT, and is fairly unlikely in 6-STABLE, and 
> will MFC that in a week.

OK, thanks.

> Could you try printing *td->td_ar?  Maybe this will give us a clue as to 
> how far it got.  In particular, this may be able to more reliably give us 
> the file descriptor number, which is audited early in the system call.  You 
> might find that 'td' is corrupted in many layers of the stack; keep going 
> up until you find one where it's good.  It may well be that 
> td->td_ar->k_ar.ar_arg_fd is correct, and might confirm that uap->fd is 
> correct still.  We'd like also to know if ARG_SOCKINFO, ARG_VNODE1, or 
> ARG_VNODE2 is set in the k_ar.ar_valid_arg field.  This may tell us some 
> more about the file descriptor even though it appears to have vanished.

td->td_ar is null (0x0) in both cases, so *td->td_ar can't be printed...

> I'm quite worried by the fact that the file descriptor seems not to be 
> present any more -- this suggests a file descriptor related race of the 
> sort that is both quite difficult to figure out and also quite a risk.  
> It's strange that it would only trigger with audit, however--perhaps audit 
> stretches out the race.  Is this an SMP box?

It's certainly looking quite nasty.  This system is uniprocessor (UP)
hardware, and the kernel is built without options SMP.

> Could you print the entire contents of *td->td_proc->p_fd?

First case:

(kgdb) p *td->td_proc->p_fd
$2 = {fd_ofiles = 0xc3441000, fd_ofileflags = 0xc3441100 "",
  fd_cdir = 0xc367f110, fd_rdir = 0xc2ce2bb0, fd_jdir = 0x0,
  fd_nfiles = 64, fd_map = 0xc3b65970, fd_lastfile = 20,
  fd_freefile = 16, fd_cmask = 63, fd_refcnt = 1, fd_holdcnt = 1,
  fd_mtx = {mtx_object = {lo_class = 0xc06ad4c4,
      lo_name = 0xc067c0fd "filedesc structure",
      lo_type = 0xc067c0fd "filedesc structure", lo_flags = 196608,
      lo_list = {tqe_next = 0x0, tqe_prev = 0x0}, lo_witness = 0x0},
    mtx_lock = 4, mtx_recurse = 0},
  fd_locked = 0, fd_wanted = 0, fd_kqlist = {slh_first = 0x0},
  fd_holdleaderscount = 0, fd_holdleaderswakeup = 0}

Second case:

(kgdb) p *td->td_proc->p_fd
$2 = {fd_ofiles = 0xc2d23600, fd_ofileflags = 0xc2d23700 "",
  fd_cdir = 0xc31b8660, fd_rdir = 0xc2ce2bb0, fd_jdir = 0x0,
  fd_nfiles = 64, fd_map = 0xc2e9c1c0, fd_lastfile = 20,
  fd_freefile = 17, fd_cmask = 63, fd_refcnt = 1, fd_holdcnt = 1,
  fd_mtx = {mtx_object = {lo_class = 0xc06ad4c4,
      lo_name = 0xc067c0fd "filedesc structure",
      lo_type = 0xc067c0fd "filedesc structure", lo_flags = 196608,
      lo_list = {tqe_next = 0x0, tqe_prev = 0x0}, lo_witness = 0x0},
    mtx_lock = 4, mtx_recurse = 0},
  fd_locked = 0, fd_wanted = 0, fd_kqlist = {slh_first = 0x0},
  fd_holdleaderscount = 0, fd_holdleaderswakeup = 0}
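
For what it's worth, both uap->fd values seen above (89 and -1023449232)
fall outside the 64 slots that fd_nfiles reports in each dump, so an
ordinary bounds-checked lookup would reject them before touching
fd_ofiles.  Below is a rough user-space sketch of that kind of check.
The model_* names and structures are made up for illustration and are
not the kernel's actual code; only the field names fd_ofiles and
fd_nfiles are taken from the dumps.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's struct file / struct filedesc. */
struct model_file { int placeholder; };

struct model_filedesc {
    struct model_file **fd_ofiles; /* open-file table */
    int fd_nfiles;                 /* number of slots allocated in fd_ofiles */
};

/*
 * Return the file for fd, or NULL if fd is negative, lies beyond the
 * allocated table, or refers to an empty slot.
 */
static struct model_file *
model_lookup(const struct model_filedesc *fdp, int fd)
{
    assert(fdp != NULL);
    if (fd < 0 || fd >= fdp->fd_nfiles)
        return NULL;
    return fdp->fd_ofiles[fd];
}

/*
 * Build a 64-slot table (matching fd_nfiles = 64 in the dumps) with only
 * slot 0 populated, then report whether fd resolves to a file.
 */
static int
model_fd_is_valid(int fd)
{
    static struct model_file f0;
    static struct model_file *slots[64] = { &f0 };
    struct model_filedesc fdp = { slots, 64 };

    return model_lookup(&fdp, fd) != NULL;
}
```

With this model, descriptor 0 resolves, while 89 and -1023449232 are
both rejected by the range check rather than read from past the end of
the table.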

If it's at all useful, I can provide access to this system and the
dumps.

Ceri
-- 
That must be wonderful!  I don't understand it at all.
                                                  -- Moliere
