On Tuesday 24 November 2009 09:39:52 Christian Gaul wrote:
> Kern Sibbald wrote:
> > Hello,
> >
> > It appears that TLS is getting stuck indefinitely in a read because of
> > some networking error.
> >
> > You might try applying the attached patch. There is a good chance that
> > it will break the SD out of this condition.
> >
> > Apply the patch with:
> >
> >   cd <bacula-source>
> >   patch -p2 <3.0.3-tls-stall.patch
> >   ./configure <your-options>
> >   make
> >   ...
> >   make install
> >
> > Feedback would be appreciated.
> >
> > Regards,
> >
> > Kern
> >
> > On Thursday 19 November 2009 10:07:00 Christian Gaul wrote:
> >> Amongst many other clients, I back up my workstation using Bacula (in
> >> this case 3.0.3, but I've been seeing this since I started using Bacula
> >> with version 2.2 something).
> >>
> >> I can see the job for my client in the director. It is in the status
> >> "Waiting for client XXX to connect to storage YYY", and it has been in
> >> that status since I turned the workstation off (around 13 hours ago).
> >> I am unable to cancel the job, because it is not running or scheduled,
> >> and none of the other jobs on the director were able to start. They
> >> are all "waiting for execution", and older jobs have been canceled
> >> (thanks for fixing the canceled email notification with 3.0.3, btw).
> >> This means that, on this director, I have not had nightly backups run
> >> on any of my clients, on any of my SDs, because a single client got
> >> turned off in between the director initializing the job and the client
> >> making the connection to the SD.
> >>
> >> I've been seeing this behavior, as I said, for a really long time now,
> >> and it has caused me enough grief to set up a second director / SDs
> >> and even two FDs per client. A single client (let's say a broken one,
> >> one being turned off, or a malicious one) can bring a whole director
> >> to a halt. Is there some magic timeout value that is set to a
> >> (useless) default value that I am missing, or is it rather
> >> non-concurrent connection creation that is blocking all my other jobs?
> >>
> >> I can leave the director in this state for a couple of hours to
> >> perform magic incantations (stacktrace, backtrace, etc.) if you want
> >> any information about this issue.
> >>
> >> I'll attach the btraceback right away, and also the last log lines,
> >> but since I am not running this director for testing, it isn't running
> >> under any debug levels.
> >>
> >> After reviewing the bconsole output to make it postable, it seems that
> >> some jobs did run after 18:03 (the time I turned off my workstation).
> >> The last job ran (to a different SD than the one that blocked) at
> >> 02:30; after that, no new jobs, even to different SDs, could start.
> >>
> >> I really appreciate the work you guys are doing on Bacula and I would
> >> love it if someone would take a look at this.
>
> I've applied the patch to the SD where the problem occurred. Since it's
> just an SD patch and doesn't change anything much, I don't think I'll
> have to exchange all SD versions.
> I will keep an eye on it, but since this only happens randomly I cannot
> promise anything much (except if it explodes or gets worse).

Fortunately, it is very unlikely that the patch will make things worse. It is
more likely that it simply will not help much, but we cannot be 100% sure
until it is tested ...

>
> Thanks for your time

_______________________________________________
Bacula-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-devel
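
For readers following the thread: the stall discussed here (a daemon blocked
indefinitely in a read because the peer has gone away) can be illustrated
with a minimal sketch. This is not the content of 3.0.3-tls-stall.patch and
it is not Bacula code. It is a hypothetical Python example, and the helper
name read_with_timeout and the 0.2 second timeout are assumptions made up
for illustration. It only shows the general idea of bounding a blocking
socket read with a timeout so that the caller can give up instead of waiting
forever.

```python
import socket


def read_with_timeout(sock, nbytes, timeout_s):
    """Read up to nbytes from sock, or return None if nothing arrives in time.

    Hypothetical helper for illustration only; not part of Bacula.
    """
    sock.settimeout(timeout_s)      # bound every blocking call on this socket
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        return None                 # peer is silent: caller can abort the job


# A connected pair of local sockets stands in for the SD and a client.
sd_side, client_side = socket.socketpair()

# The "client" sends nothing, as if it had been switched off mid-job.
stalled = read_with_timeout(sd_side, 1024, 0.2)

# Once the peer does send data, the same call returns it.
client_side.sendall(b"hello")
received = read_with_timeout(sd_side, 1024, 0.2)

sd_side.close()
client_side.close()
```

Without the settimeout call, the first recv in this sketch would block for
as long as the peer stays silent, which is roughly the symptom reported
above. Whether a timeout, a heartbeat, or some other mechanism is the right
fix for the SD is a separate question that this sketch does not answer.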
