In message <[EMAIL PROTECTED]> you wrote:
>
>   W> Any idea what to look for?
> 
> Strange -- originally you said that it core dumped, so running it under gdb
> should have caused gdb to catch that.  Or was the core dump only from strace?

Looks like this was the case. I didn't see any core dumps again
(i.e. when not running under strace).

> Anyway, you can send Ctrl-c to the gdb to wake it up and then try
> 
> thread apply all bt
> 
> which should show which thread is hanging and where.

This gives:

(gdb) run -f -c  /etc/bacula/bacula-sd.conf
Starting program: /usr/local/src/bacula/src/stored/bacula-sd -f -c /etc/bacula/bacula-sd.conf
Reading symbols from shared object read from target memory...done.
Loaded system supplied DSO at 0xa08000
[Thread debugging using libthread_db enabled]
[New Thread -1208940864 (LWP 12410)]
Detaching after fork from child process 12413.
[New Thread -1211040848 (LWP 12414)]
[New Thread -1221530704 (LWP 12415)]
Detaching after fork from child process 12416.
Detaching after fork from child process 12426.
[New Thread -1232020560 (LWP 12438)]

Program received signal SIGINT, Interrupt.
[Switching to Thread -1208940864 (LWP 12410)]
0x00a08402 in __kernel_vsyscall ()
(gdb) thread apply all bt

Thread 4 (Thread -1232020560 (LWP 12438)):
#0  0x00a08402 in __kernel_vsyscall ()
#1  0x00be38f6 in __nanosleep_nocancel () from /lib/libpthread.so.0
#2  0x08073afc in bmicrosleep (sec=1, usec=0) at bsys.c:54
#3  0x0805c894 in handle_connection_request (arg=0x827e980) at dircmd.c:197
#4  0x0808acc6 in workq_server (arg=0x80a8800) at workq.c:347
#5  0x00bdeb80 in start_thread () from /lib/libpthread.so.0
#6  0x001eb9ce in clone () from /lib/libc.so.6

Thread 3 (Thread -1221530704 (LWP 12415)):
#0  0x00a08402 in __kernel_vsyscall ()
#1  0x00be0a1c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2  0x08089c02 in watchdog_thread (arg=0x0) at watchdog.c:296
#3  0x00bdeb80 in start_thread () from /lib/libpthread.so.0
#4  0x001eb9ce in clone () from /lib/libc.so.6

Thread 2 (Thread -1211040848 (LWP 12414)):
#0  0x00a08402 in __kernel_vsyscall ()
#1  0x00be38f6 in __nanosleep_nocancel () from /lib/libpthread.so.0
#2  0x08073afc in bmicrosleep (sec=5, usec=0) at bsys.c:54
#3  0x08058d84 in DEVICE::rewind (this=0x827f5d8, dcr=0x827fe98) at dev.c:668
#4  0x0806448b in read_dev_volume_label (dcr=0x827fe98) at label.c:101
#5  0x0804bcb6 in device_initialization (arg=0x0) at stored.c:476
#6  0x00bdeb80 in start_thread () from /lib/libpthread.so.0
#7  0x001eb9ce in clone () from /lib/libc.so.6

Thread 1 (Thread -1208940864 (LWP 12410)):
#0  0x00a08402 in __kernel_vsyscall ()
#1  0x001e4221 in ___newselect_nocancel () from /lib/libc.so.6
#2  0x08076c67 in bnet_thread_server (addrs=0x8272778, max_clients=41, client_wq=0x80a8800, handle_client_request=0x805c674 <handle_connection_request(void*)>) at bnet_server.c:148
#3  0x0804c72e in main (argc=Variable "argc" is not available.
) at stored.c:241



Ummm.... read_dev_volume_label()? DEVICE::rewind()? 
Let's see...

Right. A 'mt -f ... rewind' hangs, too. So this is in fact a kernel
(driver) problem, and not Bacula's fault.
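
For anyone wanting to repeat this check from a script rather than by hand,
here is a minimal sketch. The helper name finishes_within and the 60 second
limit are my own assumptions, and the tape device has to be supplied on the
command line. It deliberately does not wait for the child after the deadline:
a process blocked inside a driver call may not react to SIGKILL, and waiting
for it would hang the checker as well.

```python
import subprocess
import sys
import time


def finishes_within(cmd, seconds, poll_interval=0.1):
    """Start cmd and return True if it exits within `seconds`, else False."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        if proc.poll() is not None:
            return True  # command returned on its own
        time.sleep(poll_interval)
    # Best effort only; we do not wait() here, because a process stuck in
    # the kernel may never go away and we must not block on it.
    proc.kill()
    return False


if __name__ == "__main__":
    # Usage: python check_rewind.py <tape-device>
    device = sys.argv[1]
    if finishes_within(["mt", "-f", device, "rewind"], 60):
        print("rewind returned")
    else:
        print("rewind did not return in time; suspect the drive or driver")
```

If the rewind does not come back here either, Bacula is not involved at all.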

Thanks.

Best regards,

Wolfgang Denk

-- 
Software Engineering:  Embedded and Realtime Systems,  Embedded Linux
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: [EMAIL PROTECTED]
Heavier than air flying machines are impossible.
                    -- Lord Kelvin, President, Royal Society, c. 1895

