On 7/10/2010 10:13 PM, Adam Leventhal wrote:
There's enough of those appearing that I suspect it's increasing dtrace's probe 
effect. I'm tempted to just throw a -q at it, but that would only mask the 
symptoms. Is there something else I should do to prevent the errors from 
occurring at all?
It's hard to say why you might be hitting that. You could investigate by doing 
something like this:

/* stop() is a destructive action, so run dtrace with -w */
ERROR
{
        stop();
        printf("stopped %d due to an error\n", pid);
        exit(0);
}

Then you can use gcore <pid> to grab a core dump and prun <pid> to set the 
process running again. From that core we should be able to figure out why the stack 
backtrace failed.
OK, I did that (took a looong time -- 5GB dump). I don't see anything out of the ordinary. Also tried attaching directly with dbx and pstack -- no errors. I added a printout of the offending tid in hopes of narrowing things down, but still nothing jumps out -- random tid and function every time.

Nearly always the offending address is 0x0, but I did (once) manage to get 0x100000; the output of pstack never shows any sign of the bad address, though. It always goes all the way down to _lwp_start.
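
For reference, a minimal sketch of an ERROR clause that reports the thread and the faulting address along with the pid. This is not the script actually used above; it assumes the documented ERROR probe arguments (arg1 is the EPID of the probe that faulted, arg2 the action index, arg4 the fault type, arg5 a fault-specific value, which for a bad-address fault is the address), and the message format is only illustrative:

```
/* Sketch only: run with -w, since stop() is a destructive action. */
ERROR
{
        /* tid is the LWP that was running when the enabled probe faulted. */
        printf("pid %d tid %d: fault type %d, value 0x%x (EPID %d, action %d)\n",
            pid, tid, arg4, arg5, arg1, arg2);
        stop();
        exit(0);
}
```

Printing the EPID and action index should at least show whether the same enabling and the same action fail every time, even when the tid and function vary.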

Disassembling functions showed nothing useful either -- the errors seem to come after any instruction (even those which do not reference memory, like "rd %pc, %o7").

Is there something specific I should look for?

Thanks!
Ryan

_______________________________________________
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org
