Hello, I have now implemented what I hope to be a fix for the seg fault that the two of you encountered. I have posted the following file on my website in case you would like to test it:
www.sibbald.com/download/bacula-beta-1.38.4-06Jan06.tar.gz
and
www.sibbald.com/download/bacula-beta-1.38.4-06Jan06.tar.gz.sig

Concerning the question of exactly when the clock starts ticking for the
MaxRunTime, it is definitely the start time that is used. The problem is in
the definition of start time. In general, the start time is a few seconds
after the schedule time. Perhaps in a later version of Bacula, I will rethink
what the start time really is, because in fact a job can be held for
lots of reasons before it actually starts backing up data.

On Friday 06 January 2006 14:50, Evelyne Cangini wrote:
> "it looks like this is due to a Maximum Run Time that you have set for the
> Job": Indeed, Max Run Time was set to 3 hours. The backup was started from
> a console command.
>
> I have another problem with that Job directive (unrelated to the
> previous one). Bacula's documentation says: "The time specifies maximum
> allowed time that a job may run, counted from the when the job starts (not
> necessarily the same as when the job was scheduled)."
>
> It seems to me the time is counted from when the job was scheduled:
> Backup scheduled:
> Schedule: run at 1:05
> Max Start Delay = 10800
> Max Run Time = 3600
> The job was canceled at 2:05 and it never started.
>
> Kern Sibbald wrote:
> >Hello,
> >
> >Apparently, you have fallen into a bug that has previously been reported.
> >From what I can tell from the traceback (nice, thanks), it looks like this
> > is due to a Maximum Run Time (I forget the exact directive name) that you
> > have set for the Job, and the watchdog decided the time had passed so it
> > canceled the job. In doing so, it trips over itself and falls flat on
> > its face :-(
> >
> >This is now my #1 priority. However, in the meantime, as a workaround
> > either increase the timeout significantly or remove it altogether ...
> >
> >On Friday 06 January 2006 12:35, Evelyne Cangini wrote:
> >>Hello,
> >>
> >>Several times, I have tried a backup which buckles: each time, the job is
> >>canceled after running for 3 hours, with SD Bytes Written around 30 GB.
> >>And the last time, the log file announced: Fatal Error because: Bacula
> >>interrupted by signal 11: Segmentation violation.
> >>
> >>I also received a mail "Bacula GDB traceback of bacula-dir":
> >>
> >>Using host libthread_db library "/lib/libthread_db.so.1".
> >>[Thread debugging using libthread_db enabled]
> >>[New Thread -1209116992 (LWP 11829)]
> >>[New Thread -1221706832 (LWP 11831)]
> >>[New Thread -1211216976 (LWP 11830)]
> >>0x00980402 in ?? ()
> >>$1 = "gaiaDir", '\0' <repeats 22 times>
> >>$2 = 0x95d1868 "bacula-dir"
> >>$3 = 0x95d1890 "/home/bacula/bin/bacula-dir"
> >>$4 = "MySQL"
> >>$5 = 0x80ca612 "1.38.2 (20 November 2005)"
> >>$6 = 0x80ca600 "i686-pc-linux-gnu"
> >>$7 = 0x80ca5f9 "redhat"
> >>$8 = 0x80ca5f0 "(Stentz)"
> >>#0 0x00980402 in ?? ()
> >>#1 0x004808f6 in __nanosleep_nocancel () from /lib/libpthread.so.0
> >>#2 0x080935db in bmicrosleep (sec=60, usec=0) at bsys.c:54
> >>#3 0x0806759a in wait_for_next_job (one_shot_job_to_run=0x0) at scheduler.c:96
> >>#4 0x0804d16f in main (argc=0, argv=0xbfb0cc14) at dird.c:244
> >>
> >>Thread 3 (Thread -1211216976 (LWP 11830)):
> >>#0 0x00980402 in ?? ()
> >>#1 0x002f64b1 in ___newselect_nocancel () from /lib/libc.so.6
> >>#2 0x0809752c in bnet_thread_server (addrs=0x95d2770, max_clients=10,
> >>    client_wq=0x80dce80,
> >>    handle_client_request=0x807dbde <handle_UA_client_request>)
> >>    at bnet_server.c:148
> >>#3 0x0807d96e in connect_thread (arg=0x95d2770) at ua_server.c:73
> >>#4 0x0047bb80 in start_thread () from /lib/libpthread.so.0
> >>#5 0x002fddee in clone () from /lib/libc.so.6
> >>
> >>Thread 2 (Thread -1221706832 (LWP 11831)):
> >>#0 0x00980402 in ?? ()
> >>#1 0x00480fbb in __waitpid_nocancel () from /lib/libpthread.so.0
> >>#2 0x080a978d in signal_handler (sig=11) at signal.c:159
> >>#3 <signal handler called>
> >>#4 0x0047caa2 in pthread_mutex_lock () from /lib/libpthread.so.0
> >>#5 0x080933d8 in _p (m=0xaaaaaaba) at bsys.c:370
> >>#6 0x0809d541 in JCR::inc_use_count (this=0xaaaaaaaa) at ../jcr.h:99
> >>#7 0x0809d1fd in get_next_jcr (prev_jcr=0xb43ce9c0) at jcr.c:581
> >>#8 0x0805e170 in job_monitor_watchdog (self=0x95e2b70) at job.c:443
> >>#9 0x080b18ac in watchdog_thread (arg=0x0) at watchdog.c:265
> >>#10 0x0047bb80 in start_thread () from /lib/libpthread.so.0
> >>#11 0x002fddee in clone () from /lib/libc.so.6
> >>
> >>Thread 1 (Thread -1209116992 (LWP 11829)):
> >>#0 0x00980402 in ?? ()
> >>#1 0x004808f6 in __nanosleep_nocancel () from /lib/libpthread.so.0
> >>#2 0x080935db in bmicrosleep (sec=60, usec=0) at bsys.c:54
> >>#3 0x0806759a in wait_for_next_job (one_shot_job_to_run=0x0) at scheduler.c:96
> >>#4 0x0804d16f in main (argc=0, argv=0xbfb0cc14) at dird.c:244
> >>#0 0x00980402 in ?? ()
> >>No symbol table info available.
> >>#1 0x004808f6 in __nanosleep_nocancel () from /lib/libpthread.so.0
> >>No symbol table info available.
> >>#2 0x080935db in bmicrosleep (sec=60, usec=0) at bsys.c:54
> >>54 stat = nanosleep(&timeout, NULL);
> >>Current language: auto; currently c++
> >>timeout = {tv_sec = 60, tv_nsec = 0}
> >>tv = {tv_sec = 1, tv_usec = 20}
> >>tz = {tz_minuteswest = 0, tz_dsttime = 0}
> >>stat = 0
> >>#3 0x0806759a in wait_for_next_job (one_shot_job_to_run=0x0) at scheduler.c:96
> >>96 bmicrosleep(NEXT_CHECK_SECS, 0); /* recheck once per minute */
> >>jcr = (JCR *) 0xbfb0cb08
> >>job = (JOB *) 0x809cd17
> >>run = (RUN *) 0x987d8e8
> >>now = 0
> >>first = false
> >>next_job = (job_item *) 0x0
> >>#4 0x0804d16f in main (argc=0, argv=0xbfb0cc14) at dird.c:244
> >>244 while ( (jcr = wait_for_next_job(runjob)) ) {
> >>ch = -1
> >>jcr = (JCR *) 0x987d8e8
> >>no_signals = 0
> >>test_config = 0
> >>uid = 0x0
> >>gid = 0x0
> >>#0 0x00000000 in ?? ()
> >>No symbol table info available.
> >>#0 0x00000000 in ?? ()
> >>No symbol table info available.
> >>#0 0x00000000 in ?? ()
> >>No symbol table info available.
> >>
> >>Thanks for your help,
> >>Evelyne

--
Best regards,

Kern

  (">
  /\
  V_V

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
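
P.S. A small worked example of the MaxRunTime timing discussed at the top of this message, using the numbers from Evelyne's report (schedule at 1:05, Max Run Time = 3600). This is only a sketch in Python, not Bacula code: the function name run_time_deadline, the 3-second gap between schedule time and start time, the calendar date, and the 2:10 data start are all assumptions made for illustration. It shows why a job that is held before writing any data can still be canceled almost exactly one hour after its scheduled time.

```python
from datetime import datetime, timedelta


def run_time_deadline(start_time: datetime, max_run_time_s: int) -> datetime:
    """Return the moment the run-time limit expires, counted from start_time.

    Hypothetical helper for illustration only; it is not a Bacula function.
    """
    return start_time + timedelta(seconds=max_run_time_s)


# Values from the report: scheduled at 1:05, Max Run Time = 3600 seconds.
# The date is arbitrary.
scheduled = datetime(2006, 1, 6, 1, 5, 0)

# Assumption: the recorded start time is "a few seconds" after the schedule
# time, here 3 seconds.
start_time = scheduled + timedelta(seconds=3)

deadline = run_time_deadline(start_time, 3600)

# Assumption: the job is held and would only begin writing data at 2:10.
data_would_start = datetime(2006, 1, 6, 2, 10, 0)
canceled_before_any_data = data_would_start >= deadline

print("start time:", start_time.strftime("%H:%M:%S"))
print("deadline:  ", deadline.strftime("%H:%M:%S"))
print("canceled before any data was written:", canceled_before_any_data)
```

With these assumed values the deadline falls a few seconds after 2:05, which is consistent with the cancellation time reported above even though the clock is counted from the start time and not from the schedule time.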