On Friday 02 February 2007 00:38, [EMAIL PROTECTED] wrote:
> I don't have a non-Solaris box to compare right now so I'm not sure
> why it showed up only on Solaris. It occurred identically on a
> dual CPU sparc, single CPU sparc, and a single CPU AMD64 box. I did
> notice right after the "delete jcrs", the memory was immediately
> filled with a repeating 10 bit pattern. That's why it died trying to
> dereference 0xaaaaaaaa. Maybe other systems don't do that so the
> memory contents at the pointer were still intact even though it had
> been freed.
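For readers unfamiliar with the technique, here is a minimal sketch of poisoning memory on free, which is why a stale pointer field reads back as 0xaaaaaaaa on a 32-bit build. The names (`poison_alloc`, `poison_fill`, `poison_free`, `BlockHeader`) are hypothetical and this is not smartalloc's actual code; it only assumes that the allocator remembers each block's size and overwrites the payload with 0xAA bytes before releasing it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical illustration only; not smartalloc's real interface.
constexpr unsigned char kPoisonByte = 0xAA;  // bit pattern 10101010

// Size bookkeeping stored in front of every payload. The alignment keeps
// the payload suitably aligned for any object type.
struct alignas(std::max_align_t) BlockHeader {
    std::size_t size;  // usable bytes following the header
};

// Allocate `size` usable bytes and remember the size in a hidden header.
void* poison_alloc(std::size_t size) {
    void* raw = std::malloc(sizeof(BlockHeader) + size);
    if (raw == nullptr) return nullptr;
    auto* hdr = static_cast<BlockHeader*>(raw);
    hdr->size = size;
    return hdr + 1;  // payload starts right after the header
}

// Overwrite the whole payload with the poison byte. The writes go through
// a volatile pointer so the compiler cannot drop them as dead stores just
// because the block is about to be freed.
void poison_fill(void* ptr) {
    assert(ptr != nullptr);
    auto* hdr = static_cast<BlockHeader*>(ptr) - 1;
    volatile unsigned char* bytes = static_cast<unsigned char*>(ptr);
    for (std::size_t i = 0; i < hdr->size; ++i) {
        bytes[i] = kPoisonByte;
    }
}

// Poison first, then hand the block back to the system allocator.
void poison_free(void* ptr) {
    if (ptr == nullptr) return;
    poison_fill(ptr);
    std::free(static_cast<BlockHeader*>(ptr) - 1);
}
```

Any pointer stored inside a block released this way becomes an obviously bogus address, so a later dereference tends to fault instead of silently using stale data. Whether it actually faults depends on the platform's address-space layout and alignment rules.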
Hmmm. There must be some subtle timing difference between Solaris and
Linux. In any case, this was only reported on Solaris, but it is a
general bug that occurred on all systems. The 0xaaaaaa is part of
smartalloc and happens on all systems, with the purpose of catching
these kinds of problems. Maybe I should change it to 0xeeeeee as it
seems that Linux doesn't trap the problem ...

Also, even more interesting is that after making the change, a number of
memory leaks that have been in Bacula for almost 2 years started showing
up in the smartalloc report. The only thing I can imagine is that this
bug somehow masked those problems (i.e. prevented the final printout).

So thanks for figuring it out.

Best regards,

Kern

>
> -Jason
>
> Kern Sibbald writes:
> > On Thursday 01 February 2007 15:17, [EMAIL PROTECTED] wrote:
> > > I traced down that crash during director shutdown. In
> > > terminate_dird(), term_msg() is freeing up a structure referenced in
> > > the watchdog queue, so it crashes in stop_watchdog().
> > >
> > > I moved stop_watchdog() in front of term_msg() but am not familiar
> > > enough with this code to know if that's a good idea.
> >
> > Yes, that is fine. In the current code, I have moved it to just after
> > the already_here = true;
> >
> > > Would there be
> > > any unwanted side effects from changing this order?
> >
> > No.
> >
> > It seems to me that someone had this problem on Solaris, but since he
> > never got a good dump and I could not reproduce it, this bug has
> > remained, so thanks for finding it.
> >
> > I'm curious why it triggered on your system.
> > Are you running on a multi-processor system?
> >
> > > -Jason
> > >
> > >
> > > -------------------------------------------------------------------------
> > > Using Tomcat but need to do more? Need to support web services,
> > > security? Get stuff done quickly with pre-integrated technology to
> > > make your job easier.
> > > Download IBM WebSphere Application Server
> > > v.1.0.1 based on Apache Geronimo
> > > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
> > > _______________________________________________
> > > Bacula-users mailing list
> > > Bacula-users@lists.sourceforge.net
> > > https://lists.sourceforge.net/lists/listinfo/bacula-users
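The shutdown-order fix discussed in this thread (calling stop_watchdog() before term_msg()) can be illustrated with a small model. This is a sketch under stated assumptions, not Bacula's code: `Resource`, `Shutdown`, and every `*_sketch` function are hypothetical stand-ins, and an `alive` flag simulates freeing so the example can detect the bad access without actually touching freed memory.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a structure owned by the message subsystem.
struct Resource {
    bool alive = true;  // set to false to simulate the structure being freed
};

// Hypothetical model of the daemon state at shutdown.
struct Shutdown {
    Resource msg_resource;                                 // owned by message code
    std::vector<Resource*> watchdog_queue{&msg_resource};  // borrowed pointer
    bool touched_dead_resource = false;                    // records the bug
};

// Models term_msg(): releases the structure the watchdog still references.
void term_msg_sketch(Shutdown& s) {
    s.msg_resource.alive = false;
}

// Models stop_watchdog(): walks the queue and looks at every entry.
void stop_watchdog_sketch(Shutdown& s) {
    for (Resource* r : s.watchdog_queue) {
        if (!r->alive) {
            s.touched_dead_resource = true;  // real code would crash here
        }
    }
    s.watchdog_queue.clear();
}

// Original order: the owner is torn down before its borrower.
// Returns true when shutdown was clean.
bool terminate_buggy_sketch(Shutdown& s) {
    term_msg_sketch(s);
    stop_watchdog_sketch(s);
    return !s.touched_dead_resource;
}

// Fixed order: stop the watchdog first, then free what it referenced.
bool terminate_fixed_sketch(Shutdown& s) {
    stop_watchdog_sketch(s);
    term_msg_sketch(s);
    return !s.touched_dead_resource;
}
```

The general rule the sketch shows is to stop the component that holds borrowed pointers before tearing down the component that owns the pointed-to data.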