On 7/10/2010 10:13 PM, Adam Leventhal wrote:
There are enough of those appearing that I suspect it's increasing dtrace's probe
effect. I'm tempted to just throw a -q at it, but that would only mask the
symptoms. Is there something else I should do to prevent the errors from
occurring at all?
It's hard to say why you might be hitting that. You could investigate by doing
something like this:
ERROR
{
	/* stop() is a destructive action, so run dtrace with -w */
	stop();
	printf("stopped %d due to an error\n", pid);
	exit(0);
}
Then you can use gcore <pid> to grab a core dump and prun <pid> to set the
process running again. From that core we should be able to figure out why the stack
backtrace failed.
OK, I did that (took a looong time -- 5GB dump). I don't see anything
out of the ordinary. Also tried attaching directly with dbx and pstack
-- no errors. I added a printout of the offending tid in hopes of
narrowing things down, but still nothing jumps out -- random tid and
function every time.
Nearly always the offending address is 0x0, but I did (once) manage to
get 0x100000; the output of pstack never shows any sign of the bad
address, though. It always goes all the way down to _lwp_start.
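For reference, a minimal sketch of an ERROR clause that prints the thread and fault details (not necessarily the exact script used here). It assumes the standard dtrace:::ERROR probe arguments, where arg4 is the fault type and arg5 is the fault-specific value (the address, for a bad-address fault), and it is meant to be added to the existing script rather than run on its own:

```d
#pragma D option destructive	/* needed for stop(); same effect as -w */

dtrace:::ERROR
{
	/* tid is the thread that was running when the fault occurred */
	printf("pid %d tid %d: fault type %d at address 0x%x\n",
	    pid, tid, arg4, arg5);

	/* freeze the process so gcore/pstack can look at that thread */
	stop();
	exit(0);
}
```

With the process stopped, pstack <pid>/<tid> limits the stack dump to the reported thread.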
Disassembling functions showed nothing useful either -- the errors seem
to come after any instruction (even those which do not reference memory,
like "rd %pc, %o7").
Is there something specific I should look for?
Thanks!
Ryan
_______________________________________________
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org