Hello,

This appears to be a deadlock, and it seems to have been triggered by a 
watchdog timeout, which means you have probably set a maximum time limit 
for a job.

Though the deadlock could be related to version 1.36.3, I'd be a bit 
surprised. Still, I cannot rule out a 1.36.3-specific problem at this point, 
so I'll check that carefully after I return from vacation.

I'd appreciate it if you would submit this traceback as a bug report along 
with your Director's conf file.

On Tuesday 24 May 2005 13:25, Masopust Christian wrote:
> Yesterday in the evening, just when starting some jobs my director
> again freezes...
>
> here's the output of btraceback (my system is Fedora Core 3, Bacula is
> 1.36.3):
>
> From [EMAIL PROTECTED]  Mon May 23 22:01:32 2005
> Return-Path: <[EMAIL PROTECTED]>
> Received: from atpcc7fc.sie.siemens.at (atpcc7fc.sie.siemens.at
> [127.0.0.1]) by atpcc7fc.sie.siemens.at (8.13.1/8.13.1) with SMTP id
> j4NK1VFi027151
>         for <[EMAIL PROTECTED]>; Mon, 23 May 2005 22:01:31 +0200
> Message-Id: <[EMAIL PROTECTED]>
> From: [EMAIL PROTECTED]
> Subject: Bacula GDB traceback of bacula-dir
> Sender: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Date: Mon, 23 May 2005 22:01:31 +0200
> Status: R
>
> Using host libthread_db library "/lib/libthread_db.so.1".
> [Thread debugging using libthread_db enabled]
> [New Thread 16384 (LWP 3346)]
> [New Thread 32769 (LWP 3351)]
> [Thread debugging using libthread_db enabled]
> [New Thread 16384 (LWP 3346)]
> [New Thread 32769 (LWP 3351)]
> [Thread debugging using libthread_db enabled]
> [New Thread 16384 (LWP 3346)]
> [New Thread 32769 (LWP 3351)]
> [New Thread 16386 (LWP 3352)]
> [New Thread 32771 (LWP 3353)]
> [New Thread 19726340 (LWP 26151)]
> [New Thread 19742725 (LWP 26152)]
> [New Thread 19759110 (LWP 26164)]
> [New Thread 19775495 (LWP 26172)]
> [New Thread 19791880 (LWP 26180)]
> [New Thread 19808265 (LWP 26203)]
> [New Thread 19824650 (LWP 26267)]
> [New Thread 19841035 (LWP 26294)]
> [New Thread 19857420 (LWP 26320)]
> [New Thread 19873805 (LWP 26381)]
> [New Thread 19890190 (LWP 26411)]
> [New Thread 19906575 (LWP 26434)]
> 0x004c80d4 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
> $1 = "atpcc7fc-dir", '\0' <repeats 17 times>
> $2 = 0x80b5230 "bacula-dir"
> $3 = 0x80b5dd0 "/opt/bacula/sbin/"
> $4 = "MySQL"
> $5 = 0x80a321c "1.36.3 (22 April 2005)"
> $6 = 0x809bfb8 "i686-redhat-linux-gnu"
> $7 = 0x809bfb1 "redhat"
> $8 = 0x809bfa4 "(Heidelberg)"
> #0  0x004c80d4 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
> #1  0x004c7708 in __pthread_wait_for_restart_signal () from
> /lib/i686/libpthread.so.0
> #2  0x004c9720 in __pthread_alt_lock () from /lib/i686/libpthread.so.0
> #3  0x004c614e in pthread_mutex_lock () from /lib/i686/libpthread.so.0
> #4  0x08057dab in jobq_add (jq=0x80b4300, jcr=0x80fc570) at jobq.c:240
> #5  0x080566d8 in run_job (jcr=0x80fc570) at job.c:140
> #6  0x0804c034 in main (argc=0, argv=0x8090b55) at dird.c:241
>
> Thread 16 (Thread 19906575 (LWP 26434)):
> #0  0x004c80d4 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
> #1  0x004c7708 in __pthread_wait_for_restart_signal () from
> /lib/i686/libpthread.so.0
> #2  0x004c3fab in [EMAIL PROTECTED] () from
> /lib/i686/libpthread.so.0
> #3  0x08087a7a in rwl_writelock (rwl=0x80b46c0) at rwlock.c:231
> #4  0x0807fd00 in lock_jcr_chain () at jcr.c:544
> #5  0x08080491 in new_jcr (size=-1254097504, daemon_free_jcr=0xfffffffc) at
> jcr.c:218
> #6  0x0806d182 in new_control_jcr (base_name=0x809bf69 "*Console*",
> job_type=-4) at ua_server.c:90
> #7  0x0806d38b in handle_UA_client_request (arg=0x8117e88) at
> ua_server.c:122
> #8  0x0808ee1b in workq_server (arg=0x80b4480) at workq.c:347
> #9  0x004c4ce1 in pthread_start_thread () from /lib/i686/libpthread.so.0
> #10 0x0043661a in clone () from /lib/i686/libc.so.6
>
> Thread 15 (Thread 19890190 (LWP 26411)):
> #0  0x004c80d4 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
> #1  0x004c7708 in __pthread_wait_for_restart_signal () from
> /lib/i686/libpthread.so.0
> #2  0x004c3fab in [EMAIL PROTECTED] () from
> /lib/i686/libpthread.so.0
> #3  0x08087a7a in rwl_writelock (rwl=0x80b46c0) at rwlock.c:231
> #4  0x0807fd00 in lock_jcr_chain () at jcr.c:544
> #5  0x08080491 in new_jcr (size=-1252000352, daemon_free_jcr=0xfffffffc) at
> jcr.c:218
> #6  0x0806d182 in new_control_jcr (base_name=0x809bf69 "*Console*",
> job_type=-4) at ua_server.c:90
> #7  0x0806d38b in handle_UA_client_request (arg=0x8116c68) at
> ua_server.c:122
> #8  0x0808ee1b in workq_server (arg=0x80b4480) at workq.c:347
> #9  0x004c4ce1 in pthread_start_thread () from /lib/i686/libpthread.so.0
> #10 0x0043661a in clone () from /lib/i686/libc.so.6
>
> Thread 14 (Thread 19873805 (LWP 26381)):
> #0  0x004c80d4 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
> #1  0x004c7708 in __pthread_wait_for_restart_signal () from
> /lib/i686/libpthread.so.0
> #2  0x004c3fab in [EMAIL PROTECTED] () from
> /lib/i686/libpthread.so.0
> #3  0x08087a7a in rwl_writelock (rwl=0x80b46c0) at rwlock.c:231
> #4  0x0807fd00 in lock_jcr_chain () at jcr.c:544
> #5  0x08080491 in new_jcr (size=-1249903200, daemon_free_jcr=0xfffffffc) at
> jcr.c:218
> #6  0x0806d182 in new_control_jcr (base_name=0x809bf69 "*Console*",
> job_type=-4) at ua_server.c:90
> #7  0x0806d38b in handle_UA_client_request (arg=0x8115a48) at
> ua_server.c:122
> #8  0x0808ee1b in workq_server (arg=0x80b4480) at workq.c:347
> #9  0x004c4ce1 in pthread_start_thread () from /lib/i686/libpthread.so.0
> #10 0x0043661a in clone () from /lib/i686/libc.so.6
>
> Thread 13 (Thread 19857420 (LWP 26320)):
> #0  0x004c80d4 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
> #1  0x004c7708 in __pthread_wait_for_restart_signal () from
> /lib/i686/libpthread.so.0
> #2  0x004c3fab in [EMAIL PROTECTED] () from
> /lib/i686/libpthread.so.0
> #3  0x08087a7a in rwl_writelock (rwl=0x80b46c0) at rwlock.c:231
> #4  0x0807fd00 in lock_jcr_chain () at jcr.c:544
> #5  0x08080491 in new_jcr (size=-1247806048, daemon_free_jcr=0xfffffffc) at
> jcr.c:218
> #6  0x0806d182 in new_control_jcr (base_name=0x809bf69 "*Console*",
> job_type=-4) at ua_server.c:90
> #7  0x0806d38b in handle_UA_client_request (arg=0x80fdf48) at
> ua_server.c:122
> #8  0x0808ee1b in workq_server (arg=0x80b4480) at workq.c:347
> #9  0x004c4ce1 in pthread_start_thread () from /lib/i686/libpthread.so.0
> #10 0x0043661a in clone () from /lib/i686/libc.so.6
>
> Thread 12 (Thread 19841035 (LWP 26294)):
> #0  0x004c80d4 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
> #1  0x004c7708 in __pthread_wait_for_restart_signal () from
> /lib/i686/libpthread.so.0
> #2  0x004c3fab in [EMAIL PROTECTED] () from
> /lib/i686/libpthread.so.0
> #3  0x08087a7a in rwl_writelock (rwl=0x80b46c0) at rwlock.c:231
> #4  0x0807fd00 in lock_jcr_chain () at jcr.c:544
> #5  0x08080491 in new_jcr (size=-1245708896, daemon_free_jcr=0xfffffffc) at
> jcr.c:218
> #6  0x0806d182 in new_control_jcr (base_name=0x809bf69 "*Console*",
> job_type=-4) at ua_server.c:90
> #7  0x0806d38b in handle_UA_client_request (arg=0x80fde78) at
> ua_server.c:122
> #8  0x0808ee1b in workq_server (arg=0x80b4480) at workq.c:347
> #9  0x004c4ce1 in pthread_start_thread () from /lib/i686/libpthread.so.0
> #10 0x0043661a in clone () from /lib/i686/libc.so.6
>
> Thread 11 (Thread 19824650 (LWP 26267)):
> #0  0x004c80d4 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
> #1  0x004c7708 in __pthread_wait_for_restart_signal () from
> /lib/i686/libpthread.so.0
> #2  0x004c3fab in [EMAIL PROTECTED] () from
> /lib/i686/libpthread.so.0
> #3  0x08087a7a in rwl_writelock (rwl=0x80b46c0) at rwlock.c:231
> #4  0x0807fd00 in lock_jcr_chain () at jcr.c:544
> #5  0x08080491 in new_jcr (size=-1243611744, daemon_free_jcr=0xfffffffc) at
> jcr.c:218
> #6  0x0806d182 in new_control_jcr (base_name=0x809bf69 "*Console*",
> job_type=-4) at ua_server.c:90
> #7  0x0806d38b in handle_UA_client_request (arg=0x80fddf8) at
> ua_server.c:122
> #8  0x0808ee1b in workq_server (arg=0x80b4480) at workq.c:347
> #9  0x004c4ce1 in pthread_start_thread () from /lib/i686/libpthread.so.0
> #10 0x0043661a in clone () from /lib/i686/libc.so.6
>
> Thread 10 (Thread 19808265 (LWP 26203)):
> #0  0x004c80d4 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
> #1  0x004c7708 in __pthread_wait_for_restart_signal () from
> /lib/i686/libpthread.so.0
> #2  0x004c3fab in [EMAIL PROTECTED] () from
> /lib/i686/libpthread.so.0
> #3  0x08087a7a in rwl_writelock (rwl=0x80b46c0) at rwlock.c:231
> #4  0x0807fd00 in lock_jcr_chain () at jcr.c:544
> #5  0x08080491 in new_jcr (size=-1241514592, daemon_free_jcr=0xfffffffc) at
> jcr.c:218
> #6  0x0806d182 in new_control_jcr (base_name=0x809bf69 "*Console*",
> job_type=-4) at ua_server.c:90
> #7  0x0806d38b in handle_UA_client_request (arg=0x80fdd78) at
> ua_server.c:122
> #8  0x0808ee1b in workq_server (arg=0x80b4480) at workq.c:347
> #9  0x004c4ce1 in pthread_start_thread () from /lib/i686/libpthread.so.0
> #10 0x0043661a in clone () from /lib/i686/libc.so.6
>
> Thread 9 (Thread 19791880 (LWP 26180)):
> #0  0x004c80d4 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
> #1  0x004c7708 in __pthread_wait_for_restart_signal () from
> /lib/i686/libpthread.so.0
> #2  0x004c3fab in [EMAIL PROTECTED] () from
> /lib/i686/libpthread.so.0
> #3  0x08087a7a in rwl_writelock (rwl=0x80b46c0) at rwlock.c:231
> #4  0x0807fd00 in lock_jcr_chain () at jcr.c:544
> #5  0x08080491 in new_jcr (size=-1239417440, daemon_free_jcr=0xfffffffc) at
> jcr.c:218
> #6  0x0806d182 in new_control_jcr (base_name=0x809bf69 "*Console*",
> job_type=-4) at ua_server.c:90
> #7  0x0806d38b in handle_UA_client_request (arg=0x80fdcc8) at
> ua_server.c:122
> #8  0x0808ee1b in workq_server (arg=0x80b4480) at workq.c:347
> #9  0x004c4ce1 in pthread_start_thread () from /lib/i686/libpthread.so.0
> #10 0x0043661a in clone () from /lib/i686/libc.so.6
>
> Thread 8 (Thread 19775495 (LWP 26172)):
> #0  0x004c80d4 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
> #1  0x004c7708 in __pthread_wait_for_restart_signal () from
> /lib/i686/libpthread.so.0
> #2  0x004c3fab in [EMAIL PROTECTED] () from
> /lib/i686/libpthread.so.0
> #3  0x08087a7a in rwl_writelock (rwl=0x80b46c0) at rwlock.c:231
> #4  0x0807fd00 in lock_jcr_chain () at jcr.c:544
> #5  0x08080491 in new_jcr (size=-1237320288, daemon_free_jcr=0xfffffffc) at
> jcr.c:218
> #6  0x0806d182 in new_control_jcr (base_name=0x809bf69 "*Console*",
> job_type=-4) at ua_server.c:90
> #7  0x0806d38b in handle_UA_client_request (arg=0x810b308) at
> ua_server.c:122
> #8  0x0808ee1b in workq_server (arg=0x80b4480) at workq.c:347
> #9  0x004c4ce1 in pthread_start_thread () from /lib/i686/libpthread.so.0
> #10 0x0043661a in clone () from /lib/i686/libc.so.6
>
> Thread 7 (Thread 19759110 (LWP 26164)):
> #0  0x004c80d4 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
> #1  0x004c7708 in __pthread_wait_for_restart_signal () from
> /lib/i686/libpthread.so.0
> #2  0x004c3fab in [EMAIL PROTECTED] () from
> /lib/i686/libpthread.so.0
> #3  0x08087a7a in rwl_writelock (rwl=0x80b46c0) at rwlock.c:231
> #4  0x0807fd00 in lock_jcr_chain () at jcr.c:544
> #5  0x08080491 in new_jcr (size=-1235223136, daemon_free_jcr=0xfffffffc) at
> jcr.c:218
> #6  0x0806d182 in new_control_jcr (base_name=0x809bf69 "*Console*",
> job_type=-4) at ua_server.c:90
> #7  0x0806d38b in handle_UA_client_request (arg=0x810b288) at
> ua_server.c:122
> #8  0x0808ee1b in workq_server (arg=0x80b4480) at workq.c:347
> #9  0x004c4ce1 in pthread_start_thread () from /lib/i686/libpthread.so.0
> #10 0x0043661a in clone () from /lib/i686/libc.so.6
>
> Thread 6 (Thread 19742725 (LWP 26152)):
> #0  0x004c80d4 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
> #1  0x004c7708 in __pthread_wait_for_restart_signal () from
> /lib/i686/libpthread.so.0
> #2  0x004c3fab in [EMAIL PROTECTED] () from
> /lib/i686/libpthread.so.0
> #3  0x08087a7a in rwl_writelock (rwl=0x80b4ee0) at rwlock.c:231
> #4  0x0808e304 in wd_lock () at watchdog.c:305
> #5  0x0808e5e4 in unregister_watchdog (wd=0x80da6b0) at watchdog.c:200
> #6  0x0808f33d in stop_btimer (wid=0x80e7470) at btimers.c:246
> #7  0x0804c63b in authenticate_storage_daemon (jcr=0x80ef9f0,
> store=0x80b85d8) at authenticate.c:103
> #8  0x08059dc5 in connect_to_storage_daemon (jcr=0x80ef9f0,
> retry_interval=10, max_retry_time=1800, verbose=1)
>     at msgchan.c:89
> #9  0x0804da74 in do_backup (jcr=0x80ef9f0) at backup.c:145
> #10 0x08056204 in job_thread (arg=0x80ef9f0) at job.c:215
> #11 0x080583bd in jobq_server (arg=0x80b4300) at jobq.c:444
> #12 0x004c4ce1 in pthread_start_thread () from /lib/i686/libpthread.so.0
> #13 0x0043661a in clone () from /lib/i686/libc.so.6
>
> Thread 5 (Thread 19726340 (LWP 26151)):
> #0  0x004c80d4 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
> #1  0x004c7708 in __pthread_wait_for_restart_signal () from
> /lib/i686/libpthread.so.0
> #2  0x004c3fab in [EMAIL PROTECTED] () from
> /lib/i686/libpthread.so.0
> #3  0x08087a7a in rwl_writelock (rwl=0x80b46c0) at rwlock.c:231
> #4  0x0807fd00 in lock_jcr_chain () at jcr.c:544
> #5  0x080588e1 in jobq_server (arg=0x80b4300) at jobq.c:582
> #6  0x004c4ce1 in pthread_start_thread () from /lib/i686/libpthread.so.0
> #7  0x0043661a in clone () from /lib/i686/libc.so.6
>
> Thread 4 (Thread 32771 (LWP 3353)):
> #0  0x004c80d4 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
> #1  0x004c7708 in __pthread_wait_for_restart_signal () from
> /lib/i686/libpthread.so.0
> #2  0x004c9720 in __pthread_alt_lock () from /lib/i686/libpthread.so.0
> #3  0x004c614e in pthread_mutex_lock () from /lib/i686/libpthread.so.0
> #4  0x080804f6 in get_next_jcr (prev_jcr=0xfffffffc) at jcr.c:581
> #5  0x08080619 in jcr_timeout_check (self=0x80c3360) at jcr.c:615
> #6  0x0808e533 in watchdog_thread (arg=0x0) at watchdog.c:257
> #7  0x004c4ce1 in pthread_start_thread () from /lib/i686/libpthread.so.0
> #8  0x0043661a in clone () from /lib/i686/libc.so.6
>
> Thread 3 (Thread 16386 (LWP 3352)):
> #0  0x0042f251 in select () from /lib/i686/libc.so.6
> #1  0x00000006 in ?? ()
> #2  0x080cf47c in ?? ()
> #3  0xb7f572f0 in ?? ()
> #4  0x00000000 in ?? ()
>
> Thread 2 (Thread 32769 (LWP 3351)):
> #0  0x0042cf7a in poll () from /lib/i686/libc.so.6
> #1  0x004c54c0 in __pthread_manager () from /lib/i686/libpthread.so.0
> #2  0x0043661a in clone () from /lib/i686/libc.so.6
>
> Thread 1 (Thread 16384 (LWP 3346)):
> #0  0x004c80d4 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
> #1  0x004c7708 in __pthread_wait_for_restart_signal () from
> /lib/i686/libpthread.so.0
> #2  0x004c9720 in __pthread_alt_lock () from /lib/i686/libpthread.so.0
> #3  0x004c614e in pthread_mutex_lock () from /lib/i686/libpthread.so.0
> #4  0x08057dab in jobq_add (jq=0x80b4300, jcr=0x80fc570) at jobq.c:240
> #5  0x080566d8 in run_job (jcr=0x80fc570) at job.c:140
> #6  0x0804c034 in main (argc=0, argv=0x8090b55) at dird.c:241
> #0  0x004c80d4 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
> No symbol table info available.
> #1  0x004c7708 in __pthread_wait_for_restart_signal () from
> /lib/i686/libpthread.so.0
> No symbol table info available.
> #2  0x004c9720 in __pthread_alt_lock () from /lib/i686/libpthread.so.0
> No symbol table info available.
> #3  0x004c614e in pthread_mutex_lock () from /lib/i686/libpthread.so.0
> No symbol table info available.
> #4  0x08057dab in jobq_add (jq=0x80b4300, jcr=0x80fc570) at jobq.c:240
> 240        if ((stat = pthread_mutex_lock(&jq->mutex)) != 0) {
> Current language:  auto; currently c++
> stat = 135251312
> sched_pkt = (wait_pkt *) 0xfffffffc
> item = (jobq_item_t *) 0x80fc570
> li = (jobq_item_t *) 0x7f
> wtime = -1
> id = 135251948
> #5  0x080566d8 in run_job (jcr=0x80fc570) at job.c:140
> 140        if ((stat = jobq_add(&job_queue, jcr)) != 0) {
> be = {<SMARTALLOC> = {<No data fields>}, buf_ = 0x80fb448
> "ðù\016\bpÅ\017\b\001", berrno_ = 1}
> stat = 134822556
> errstat = 134822556
> JobId = 346
> #6  0x0804c034 in main (argc=0, argv=0x8090b55) at dird.c:241
> 241           run_job(jcr);                   /* run job */
> jcr = (JCR *) 0x80fc570
> test_config = 0
> ch = 135251312
> no_signals = 0
> uid = 0x0
> gid = 0x0
> #0  0x00000000 in ?? ()
> No symbol table info available.
>
>
> any idea??
>
> what's your practice? do you restart bacula every day? should i?
>
> Thanks a lot,
> Christian

-- 
Best regards,

Kern

  (">
  /\
  V_V


_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
