Thanks,
problem solved.


2010/1/12 Kern Sibbald <k...@sibbald.com>

> On Tuesday 12 January 2010 12:26:37 Carlo Filippetto wrote:
> > Has anyone found a more definitive solution?
> > For the moment I have had to disable mail.
>
> The failure does not occur if you have a valid SMTP server defined.
>
> >
> > Thanks
> >
> >
> >
> > 2010/1/9 Renaud Marquet <rmarq...@gmail.com>
> >
> > > On Saturday 09 January 2010 at 21:25 +0100, Kern Sibbald wrote:
> > > > Hello,
> > > >
> > > > On Saturday 09 January 2010 20:20:01 Renaud Marquet wrote:
> > > > > Kern,
> > > > >
> > > > > although I searched for a possible workaround, I didn't find the
> > > > > ones you talk about. But your statement is not correct, as pointing
> > > > > to a valid smtp server is not a proper workaround. Actually, if for
> > > > > some reason the *valid* smtp server is down, the problem will occur,
> > > > > and I bet users will not figure out the reason.
> > > >
> > > > I never claimed that my suggestion was a "proper" workaround nor that
> > > > it was a fix.  It is a workaround.
> > >
> > > Never mind then ;)
> > >
> > > > If you want, you can backport the fixes (applied 23 October 2009),
> > > > but since we are close to release, and we have a workaround, we are
> > > > not planning to backport them.
> > >
> > > No need to backport. This is not a 'blocker' problem, I just mailed
> > > here in case someone else runs into the same problem, because there
> > > wasn't any answer when googling. Bacula now runs perfectly fine on my
> > > system, so I can wait for the upcoming release without any trouble.
> > >
> > > > > That's why I came up with this patch. It correctly fixes the
> > > > > problem, but I recognize this could affect performance, so it should
> > > > > certainly not be put in the trunk. It will probably even be useless
> > > > > since, as you pointed out, it's already fixed in the development
> > > > > version.
> > > >
> > > > Unfortunately your patch does not fix the problem -- it masks the
> > > > problem.  I didn't look at your patch in detail, but I believe that it
> > > > will make all locks recursive, which is not really what we want and
> > > > may lead to some surprises.
> > > >
> > > > Bacula does have recursive locks, but we use them only in situations
> > > > where they need to be used and they are portable.  I am not so much
> > > > worried about the performance consequences of your patch, but your
> > > > code is Linux only if I am not mistaken (i.e. not portable), and as I
> > > > said, the lock manager is not production code.  It is development
> > > > code that should only be turned on by developers for debugging.
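
For reference on the distinction drawn above between recursive locks and other lock kinds, here is a minimal sketch in plain C using the POSIX `PTHREAD_MUTEX_RECURSIVE` type. It is not Bacula's recursive lock implementation; `recursive_relock` is a hypothetical helper written only for illustration. It shows that a recursive mutex lets the owning thread lock it a second time, as long as every lock is matched by an unlock.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose PTHREAD_MUTEX_RECURSIVE */
#endif
#include <pthread.h>

/* Hypothetical helper, not Bacula code: locks a recursive mutex twice
 * from the same thread and returns the result of the second lock call
 * (0 on success), or -1 if the mutex could not be set up. */
int recursive_relock(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    int second;

    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    if (pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE) != 0 ||
        pthread_mutex_init(&m, &attr) != 0) {
        pthread_mutexattr_destroy(&attr);
        return -1;
    }
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&m);            /* first lock */
    second = pthread_mutex_lock(&m);   /* same thread again: allowed */
    if (second == 0)
        pthread_mutex_unlock(&m);      /* undo the second lock */
    pthread_mutex_unlock(&m);          /* undo the first lock */
    pthread_mutex_destroy(&m);
    return second;
}
```

Note that `PTHREAD_MUTEX_RECURSIVE` and `PTHREAD_MUTEX_ERRORCHECK` are the POSIX names, while the `_NP` variants are non-portable spellings.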
> > >
> > > As I said in another mail, I didn't do anything to activate this lock
> > > manager, so I guess it's not. I think the confusion comes from the fact
> > > that mutexes are handled through some functions in lockmgr.c (through a
> > > macro), I think even with the lock manager deactivated.
> > >
> > > > > That said, I didn't know the lock manager should be turned off in
> > > > > production environments. Moreover, I'm not sure I understand your
> > > > > point because, although I didn't read all the code, it seems pretty
> > > > > strange to me that a multithreaded application should not use any
> > > > > mutexes in a production environment.
> > > >
> > > > We use mutexes in production as in development.  The lock manager
> > > > "watches" our lock usage and blows up Bacula if it detects a problem
> > > > (deadlock, out of order locks, ...).  It is a debug tool and not meant
> > > > or sufficiently tested for production use.  Use it at your own risk.
> > > >
> > > > That said, you were very clever to figure out the problem. Not many
> > > > users could do so.
> > >
> > > Thank you,
> > > Regards.
> > >
> > > > Regards,
> > > >
> > > > Kern
> > > >
> > > > > Regards,
> > > > > Renaud
> > > > >
> > > > > On Saturday 09 January 2010 at 00:03 +0100, Kern Sibbald wrote:
> > > > > > Hello Arno and Renaud,
> > > > > >
> > > > > > I can believe that there might be a bug in the lock manager
> > > > > > software, but I am very surprised that it is turned on. It should
> > > > > > only be turned on for developers, and thus though this patch may
> > > > > > be correct (I don't think so, but Eric can answer more
> > > > > > definitively), it should never be needed in a production system,
> > > > > > and won't work in a production system because of the lock manager
> > > > > > being turned off.
> > > > > >
> > > > > > Can you explain why the lock manager code is turned on?
> > > > > >
> > > > > > If this is a problem with a misconfigured mail daemon, then it is
> > > > > > very likely that this problem has already shown up and has a very
> > > > > > different solution. The problem I just mentioned is fixed in the
> > > > > > current development version, and the workaround for version 3.0.x
> > > > > > is to ensure that either email is turned off or you point to a
> > > > > > valid smtp server.
> > > > > >
> > > > > > Regards,
> > > > > >
> > > > > > Kern
> > > > > >
> > > > > > On Friday 08 January 2010 21:32:18 Arno Lehmann wrote:
> > > > > > > Hello,
> > > > > > >
> > > > > > > this is just forwarding your mail to bacula-devel, where it's
> > > > > > > more likely to be picked up, looked at, and perhaps integrated
> > > > > > > into the code base :-)
> > > > > > >
> > > > > > > Cheers, and thanks for not only analyzing the problem, but also
> > > > > > > providing a possible fix!
> > > > > > >
> > > > > > > Arno
> > > > > > >
> > > > > > > 07.01.2010 16:34, Renaud Marquet wrote:
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > I'm using bacula 3.0.3 and the director's job queue was stuck
> > > > > > > > after running the first job. The others were waiting
> > > > > > > > indefinitely for execution. If the director was restarted, I
> > > > > > > > could run only one job, and so on.
> > > > > > > >
> > > > > > > > Googling around I found these 2 posts without satisfying
> > > > > > > > answers:
> > > > > > > >
> > > > > > > > http://www.backupcentral.com/phpBB2/two-way-mirrors-of-external-mailing-lists-3/bacula-25/upgrade-to-3-0-3-job-is-waiting-for-execution-102156/
> > > > > > > > http://www.backupcentral.com/phpBB2/two-way-mirrors-of-external-mailing-lists-3/bacula-25/job-is-waiting-for-execuition-101508/
> > > > > > > >
> > > > > > > > I then looked at the code and found there is a deadlock
> > > > > > > > happening in message handling.
> > > > > > > >
> > > > > > > > The problem is located in the close_msg(JCR *) function in
> > > > > > > > message.c. When it encounters an error while sending an e-mail,
> > > > > > > > it calls the macro Jmsg1 (line 485) to report it. This macro
> > > > > > > > calls dispatch_message, which tries to acquire fides_mutex
> > > > > > > > (line 738). Unfortunately, this mutex was already acquired in
> > > > > > > > close_msg (line 431), thus resulting in a deadlock (as stated
> > > > > > > > in the mutex documentation for the PTHREAD_MUTEX_INITIALIZER
> > > > > > > > kind).
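
The relock described above can be reproduced safely in isolation. The sketch below is plain C with hypothetical names (`relock_same_thread`, `demo_mutex`) and is not taken from message.c. It uses the POSIX `PTHREAD_MUTEX_ERRORCHECK` type so that the second lock attempt returns `EDEADLK` rather than blocking. With a default mutex, as created by `PTHREAD_MUTEX_INITIALIZER`, the same second call would typically hang, which matches the symptom reported here.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose PTHREAD_MUTEX_ERRORCHECK */
#endif
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the mutex discussed above; not Bacula code. */
static pthread_mutex_t demo_mutex;

/* Locks demo_mutex twice from the same thread and returns the result of
 * the second lock call, or -1 if the mutex could not be set up.
 * The outer lock plays the role of the caller that already holds the
 * mutex; the inner lock plays the role of the reporting path. */
int relock_same_thread(void)
{
    pthread_mutexattr_t attr;
    int second;

    /* The attribute object must be initialised before it is used. */
    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    if (pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK) != 0 ||
        pthread_mutex_init(&demo_mutex, &attr) != 0) {
        pthread_mutexattr_destroy(&attr);
        return -1;
    }
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&demo_mutex);            /* outer lock */
    second = pthread_mutex_lock(&demo_mutex);   /* inner lock: fails */
    pthread_mutex_unlock(&demo_mutex);          /* release outer lock */
    pthread_mutex_destroy(&demo_mutex);
    return second;
}
```

An error-checking mutex only turns the hang into an error code; the caller still has to handle that code, and the nested locking itself remains in place.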
> > > > > > > >
> > > > > > > > This problem was affecting me because the mail daemon was not
> > > > > > > > properly configured on my server.
> > > > > > > >
> > > > > > > > It could be interesting to review these parts of the code to
> > > > > > > > avoid such situations.
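
One way to avoid this kind of situation is to record the error while the mutex is held and report it only after the mutex has been released. The following C sketch illustrates that pattern with hypothetical stand-ins (`report_error` for the reporting path, `close_messages` for the caller, `msg_mutex` for the shared mutex). It is an assumption about one possible restructuring, not the fix that was applied in the Bacula development version.

```c
#include <pthread.h>
#include <stdio.h>

/* Hypothetical names throughout; this is not Bacula code. */
static pthread_mutex_t msg_mutex = PTHREAD_MUTEX_INITIALIZER;
static int reports_sent = 0;

/* Stand-in for the reporting path: it takes msg_mutex itself, so it
 * must never be called while the caller already holds msg_mutex. */
static void report_error(const char *msg)
{
    pthread_mutex_lock(&msg_mutex);
    reports_sent++;
    fprintf(stderr, "error: %s\n", msg);
    pthread_mutex_unlock(&msg_mutex);
}

/* Stand-in for the caller. mail_failed simulates a failed mail send.
 * Returns 1 if an error was reported, 0 otherwise. */
int close_messages(int mail_failed)
{
    char pending[128] = "";

    pthread_mutex_lock(&msg_mutex);
    if (mail_failed) {
        /* Only remember the error here; do not report it yet. */
        snprintf(pending, sizeof(pending), "mail program failed");
    }
    pthread_mutex_unlock(&msg_mutex);

    /* The mutex is free again, so the reporting path can take it. */
    if (pending[0] != '\0') {
        report_error(pending);
        return 1;
    }
    return 0;
}
```

Calling `close_messages(1)` returns instead of hanging, because `report_error` runs only after the lock has been dropped.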
> > > > > > > >
> > > > > > > > However I wrote a quick patch for lockmgr.c which simply
> > > > > > > > upgrades mutexes to the PTHREAD_MUTEX_ERRORCHECK_NP kind and
> > > > > > > > resolves this error.
> > >
> > > > > > > > Hope this would help someone,
> > > > > > > > Renaud
> > > > > > > >
> > > > > > > > patch :
> > > > > > > >
> > > > > > > > diff -rupN bacula-3.0.3.vanilla/src/lib/lockmgr.c bacula-3.0.3.patched/src/lib/lockmgr.c
> > > > > > > > --- bacula-3.0.3.vanilla/src/lib/lockmgr.c    2009-10-18 11:10:16.000000000 +0200
> > > > > > > > +++ bacula-3.0.3.patched/src/lib/lockmgr.c    2009-12-31 18:05:59.000000000 +0100
> > > > > > > > @@ -616,6 +616,15 @@ void lmgr_cleanup_main()
> > > > > > > >   */
> > > > > > > >  int lmgr_mutex_lock(pthread_mutex_t *m, const char *file, int line)
> > > > > > > >  {
> > > > > > > > +   /* Patch to avoid deadlock if mutex is locked more than once */
> > > > > > > > +   /* There's some performance hit which makes it probably not acceptable */
> > > > > > > > +   /* for large system usage. */
> > > > > > > > +   if(*m == PTHREAD_MUTEX_INITIALIZER) {
> > > > > > > > +      pthread_mutexattr_t attr;
> > > > > > > > +      pthread_mutexattr_settype( &attr, PTHREAD_MUTEX_ERRORCHECK_NP );
> > > > > > > > +      pthread_mutex_init( m, &attr );
> > > > > > > > +   }
> > > > > > > > +
> > > > > > > >     int ret;
> > > > > > > >     lmgr_thread_t *self = lmgr_get_thread_info();
> > > > > > > >     self->pre_P(m, file, line);
> > >
> > > > > > > > ------------------------------------------------------------------------------
> > > > > > > > This SF.Net email is sponsored by the Verizon Developer Community
> > > > > > > > Take advantage of Verizon's best-in-class app development support
> > > > > > > > A streamlined, 14 day to market process makes app distribution fast and easy
> > > > > > > > Join now and get one step closer to millions of Verizon customers
> > > > > > > > http://p.sf.net/sfu/verizon-dev2dev
> > > > > > > > _______________________________________________
> > > > > > > > Bacula-users mailing list
> > > > > > > > Bacula-users@lists.sourceforge.net
> > > > > > > > https://lists.sourceforge.net/lists/listinfo/bacula-users