You may also find running sudo strace -fF -o list.out ./bconsole helpful. Once it hangs - give it a few minutes - then bomb out and post the list.out file here for us to take a look at :-)
-----Original Message-----
From: Dep, Khushil (GE Money)
Sent: 30 October 2007 15:50
To: Dep, Khushil (GE Money); Johan van Vliet; bacula-users@lists.sourceforge.net
Subject: RE: [Bacula-users] Hanging after starting job and executing status

Here's the output of mine:

sudo strace -cfF -o list.out ./bconsole

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 38.93    0.038702        4838         8         4 waitpid
 17.11    0.017006        1546        11         1 futex
 15.53    0.015440         102       151           write
 12.40    0.012325          37       330           read
  3.17    0.003149         787         4           execve
  2.30    0.002289          41        56           rt_sigprocmask
  1.63    0.001620          32        51           close
  1.62    0.001609         322         5           clone
  1.45    0.001441          34        42         7 open
  1.08    0.001072          12        91           rt_sigaction
  0.79    0.000783          19        42           old_mmap
  0.70    0.000700          44        16           munmap
  0.53    0.000528         176         3         2 connect
  0.35    0.000344          10        36           fstat64
  0.29    0.000290          13        22           mmap2
  0.29    0.000289          16        18         2 ioctl
  0.26    0.000256          28         9           mprotect
  0.22    0.000223          12        18         9 stat64
  0.14    0.000137          11        12           time
  0.12    0.000121          10        12           brk
  0.12    0.000118          12        10           gettimeofday
  0.11    0.000111          28         4           sigreturn
  0.11    0.000105          13         8         4 access
  0.10    0.000104          15         7           uname
  0.09    0.000086          29         3           socket
  0.08    0.000075           8         9           select
  0.06    0.000059           8         7           _llseek
  0.06    0.000059           7         8           fcntl64
  0.05    0.000053          13         4           set_thread_area
  0.05    0.000045          45         1           _sysctl
  0.05    0.000045          11         4           clock_gettime
  0.04    0.000035           9         4           dup2
  0.03    0.000033          17         2           pipe
  0.03    0.000026          13         2           setsockopt
  0.02    0.000024           6         4           nanosleep
  0.02    0.000023          12         2           geteuid32
  0.02    0.000018           9         2           getuid32
  0.02    0.000016           8         2           getegid32
  0.01    0.000014           7         2           getrlimit
  0.01    0.000012           6         2           getgid32
  0.01    0.000009           9         1           getpgrp
  0.01    0.000008           8         1           getpid
  0.01    0.000008           8         1           getppid
  0.01    0.000008           8         1           set_tid_address
------ ----------- ----------- --------- --------- ----------------
100.00    0.099418                  1028        29 total

That's without errors. If you run that against your bconsole you should see where it's bombing? Anyone else?
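For anyone comparing two such summaries (a good run against a hanging one), here is a rough Python sketch that reads the "strace -c" table and lists the syscalls that reported errors. The function name parse_strace_summary and the SAMPLE text are made up for illustration; SAMPLE is just three rows copied from the table above. Keep in mind that -c only prints per-syscall counts at exit, so it will not show which call was in progress when bconsole hung; the plain "strace -fF -o list.out" trace is the one to read for that.

```python
def parse_strace_summary(text):
    """Parse the table printed by `strace -c` into a list of dicts.

    Hypothetical helper for illustration only; it assumes the classic
    column layout: % time, seconds, usecs/call, calls, [errors], syscall.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have 5 fields (errors column blank) or 6 fields.
        if len(fields) not in (5, 6) or fields[-1] == "total":
            continue
        try:
            row = {
                "pct_time": float(fields[0]),
                "seconds": float(fields[1]),
                "usecs_per_call": int(fields[2]),
                "calls": int(fields[3]),
                "errors": int(fields[4]) if len(fields) == 6 else 0,
                "syscall": fields[-1],
            }
        except ValueError:
            # Header and dashed separator lines end up here.
            continue
        rows.append(row)
    return rows


# Three rows copied from the summary posted above.
SAMPLE = """\
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 38.93    0.038702        4838         8         4 waitpid
 15.53    0.015440         102       151           write
  0.53    0.000528         176         3         2 connect
------ ----------- ----------- --------- --------- ----------------
100.00    0.099418                  1028        29 total
"""

rows = parse_strace_summary(SAMPLE)
for row in rows:
    if row["errors"] > 0:
        print(f'{row["syscall"]}: {row["errors"]} of {row["calls"]} calls failed')
```

To use it on a real run, read list.out into a string and pass it to parse_strace_summary in place of SAMPLE.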
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dep, Khushil (GE Money)
Sent: 30 October 2007 15:38
To: Johan van Vliet; bacula-users@lists.sourceforge.net
Subject: Re: [Bacula-users] Hanging after starting job and executing status

Sounds like something to do with the port forwarding to me. From a shell on client1 or client2, can you telnet to client3 on port 9103? If not, something screwy is going on with your port forwarding.

You could always turn on strace....

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Johan van Vliet
Sent: 30 October 2007 15:00
To: bacula-users@lists.sourceforge.net
Subject: [Bacula-users] Hanging after starting job and executing status

Hi,

I run a small site with 3 clients. Client 3 also tripled as director and storage. Like this:

[CLIENT1]--+
           +--[INTERNET]--[CLIENT3/DIR/STORAGE]
[CLIENT2]--+

Client 1 and client 2 are Slackware Linux and client 3 is FreeBSD.

Because I ran out of disk space I moved the storage to another (4th) machine, also FreeBSD. I also (stupid, I know) upgraded from 2.2.4 to 2.2.5:

[CLIENT1]--+
           +--[INTERNET]--[CLIENT3/DIR]--[STORAGE]
[CLIENT2]--+

Because I've only one public IP address I also installed portfwd on client 3 to forward 9103 from client 3 to storage. So far so good.

As a first test I backed up client 3. This worked! Then I tried to back up client 1 and it stopped halfway on /dev/cciss/c0...something.... On the second attempt it stopped on another file, etc. So I tried client 2. Same result.

Long story short: I can reproduce the hang when I run a backup of the remote clients (1 or 2) and then type "status storage" in bconsole. Like this:

*run
A job name must be specified.
The defined Job resources are:
     1: client1-job
     2: client2-job
     3: client3-job
     4: BackupCatalog
     5: RestoreFiles
Select Job resource (1-5): 3
Run Backup job
JobName:  client2-job
Level:    Incremental
Client:   client2-fd
FileSet:  client2 Set
Pool:     Default (From Job resource)
Storage:  File (From Job resource)
When:     2007-10-30 13:24:01
Priority: 15
OK to run? (yes/mod/no): yes
Job queued. JobId=16

Note that the "messages" here show me that the backup is starting. I can also see it with status running in the jobs list. As soon as I type:

*status storage
Automatically selected Storage: File
Connecting to Storage daemon File at client3:9103

it hangs and doesn't return to the * prompt. The "bacula-fd -f -d100 -c <config>" and "bacula-sd -f -d100 -c <config>" show output until the "status restore". The only way to get the prompt back is by killing the bacula-sd.

Any insights about what can cause this are welcome.

J.

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
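
If telnet isn't installed on the clients, the port check suggested earlier in the thread (client1 or client2 connecting to client3 on 9103) can be approximated with a few lines of Python. This is only a sketch: port_is_open is a made-up helper name, and the 5 second timeout is an arbitrary choice. Also note that with portfwd in the path, a successful connect may only mean the forwarder accepted the connection; it does not by itself show that bacula-sd on the storage machine is answering.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only tests the TCP connect,
    not whether the daemon behind the port speaks the Bacula protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Example, run from client1 or client2 (hostname taken from the thread):
#   print(port_is_open("client3", 9103))
```

A False result from the remote clients, with a True result when run on client3 itself, would point at the forwarding rather than at Bacula.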