On Friday 02 February 2007 00:38, [EMAIL PROTECTED] wrote:
> I don't have a non-Solaris box to compare right now so I'm not sure
> why it showed up only on Solaris.  It occurred identically on a
> dual CPU sparc, single CPU sparc, and a single CPU AMD64 box.  I did
> notice right after the "delete jcrs", the memory was immediately
> filled with a repeating "10" bit pattern.  That's why it died trying
> to dereference 0xaaaaaaaa.  Maybe other systems don't do that so the
> memory contents at the pointer were still intact even though it had
> been freed.

Hmmm. There must be some subtle timing difference between Solaris and Linux.
In any case, this was only reported on Solaris, but it is a general bug that
is present on all systems.  The 0xaaaaaa fill is part of smartalloc and
happens on all systems, with the purpose of catching these kinds of problems.
Maybe I should change it to 0xeeeeee as it seems that Linux doesn't trap the
problem ...
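
To illustrate the fill-on-free idea described above, here is a minimal C
sketch. It is not smartalloc's actual code: the names poison(),
poison_free() and looks_poisoned() are hypothetical, and the caller is
assumed to know the block size. It only shows why a pointer stored in a
released block reads back as 0xaaaaaaaa.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* 0xAA is binary 10101010, so a poisoned 32-bit word reads 0xAAAAAAAA. */
#define POISON_BYTE 0xAA

/* Overwrite n bytes at p with the poison byte.  The volatile pointer keeps
 * the compiler from discarding the stores as dead writes before free(). */
void poison(void *p, size_t n)
{
    assert(p != NULL || n == 0);
    volatile unsigned char *b = p;
    while (n--) {
        *b++ = POISON_BYTE;
    }
}

/* Hypothetical wrapper (an assumption, not a Bacula function):
 * poison the block, then hand it back to the allocator. */
void poison_free(void *p, size_t n)
{
    if (p == NULL) {
        return;
    }
    poison(p, n);
    free(p);
}

/* Return 1 if a 32-bit value consists entirely of poison bytes. */
int looks_poisoned(uint32_t v)
{
    return v == 0xAAAAAAAAu;
}
```

A stale pointer field inside a block released this way holds the poison
value, so following it tends to fault right away instead of silently
reading old data. Whether the fault actually occurs depends on whether
that address happens to be mapped on the platform in question.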

Also, even more interesting is that after making the change, a number of 
memory leaks that have been in Bacula for almost 2 years started showing up 
in the smartalloc report.  The only thing I can imagine is that this bug 
somehow masked those problems (i.e. prevented the final printout).

So thanks for figuring it out.

Best regards,

Kern

>
>                                       -Jason
>
> Kern Sibbald writes:
>  > On Thursday 01 February 2007 15:17, [EMAIL PROTECTED] wrote:
>  > > I traced down that crash during director shutdown.  In
>  > > terminate_dird(), term_msg() is freeing up a structure referenced in
>  > > the watchdog queue, so it crashes in stop_watchdog().
>  > >
>  > > I moved stop_watchdog() in front of term_msg() but am not familiar
>  > > enough with this code to know if that's a good idea.
>  >
>  > Yes, that is fine.  In the current code, I have moved it to just after
>  > the already_here = true;
>  >
>  > > Would there be
>  > > any unwanted side effects from changing this order?
>  >
>  > No.
>  >
>  > It seems to me that someone had this problem on Solaris, but since he
>  > never got a good dump and I could not reproduce it, this bug has
>  > remained, so thanks for finding it.
>  >
>  > I'm curious why it triggered on your system.
>  > Are you running on a multi-processor system?
>  >
>  > >                                  -Jason
>  > >
>  > >
>  > > -------------------------------------------------------------------------
>  > > Using Tomcat but need to do more? Need to support web services,
>  > > security? Get stuff done quickly with pre-integrated technology to
>  > > make your job easier. Download IBM WebSphere Application Server
>  > > v.1.0.1 based on Apache Geronimo
>  > > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
>  > > _______________________________________________
>  > > Bacula-users mailing list
>  > > Bacula-users@lists.sourceforge.net
>  > > https://lists.sourceforge.net/lists/listinfo/bacula-users
>
