i've got a panic that appears to be a race in ilock.
but perhaps i'm missing something and it's actually
a h/w problem.

in this situation there are 128 kernel procs that all
increment the same counter with some code
that looks like so:

void
incref(void)
{
        ilock(&somelock);
        someval++;
        iunlock(&somelock);
}

(i realize there are probably better ways to do this.)
there is a similar function to decrease the value.
other than this, there are no references to somelock.

what i'm seeing is a panic with someval = 5 (gathered
from the fact that someval is stored immediately after
somelock and so is dumped by dumplockmem())
and the panic message:

corrupt ilock &somelock pc=&incref m=0 isilock=1

since the value of isilock is exactly 1, it's hard to imagine
this is a bad memory stick or a wild ptr.

i can't see how this could happen unless, on the very first
reference to the lock, there is a race: the loser
evaluates !l->isilock before that processor can see
the winner setting l->isilock to 1.
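
to make the suspected interleaving concrete, here is a
single-threaded replay of it. this is only a sketch of my
reading, not the kernel source: the Lock below is a
hypothetical stand-in with just the two fields the argument
needs, tas() is modelled without any real atomicity, and
replay() is a made-up name. it assumes the holder sets
isilock only after winning the tas, and that the loser
checks isilock on the contended path.

```c
#include <assert.h>

/* hypothetical stand-in for the kernel Lock; only the fields the argument needs */
typedef struct Lock Lock;
struct Lock {
	int key;	/* 0 = free, 1 = held */
	int isilock;	/* set by the holder after it wins the tas */
};

/* single-threaded model of test-and-set: returns the old key and sets it to 1 */
static int
tas(int *key)
{
	int old;

	old = *key;
	*key = 1;
	return old;
}

/*
 * replay the suspected interleaving one step at a time.
 * returns 1 if the loser would take the "corrupt ilock" path.
 */
int
replay(Lock *l)
{
	int winnerold, loserold, seen;

	assert(l->key == 0);		/* the model assumes the lock starts out free */

	winnerold = tas(&l->key);	/* cpu0: wins, key was 0 */
	loserold = tas(&l->key);	/* cpu1: loses, key is already 1 */
	seen = l->isilock;		/* cpu1: checks before cpu0's store lands */
	l->isilock = 1;			/* cpu0: marks the lock as an ilock */

	/* if cpu1 panics here, isilock already reads 1 by the time it is printed */
	return winnerold == 0 && loserold != 0 && !seen;
}
```

with a zeroed Lock, replay() reports the panic path and
leaves isilock at 1, which would match the message above.
with a Lock whose isilock was already 1 from some earlier
use, the same interleaving is harmless in this model.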

if this diagnosis is correct, what is the proper fix?
in the fs code, ken locks and unlocks each lock once
before using it.  is this required for kernel
ilocks, too?

- erik
