On 10/16/2013 5:43 pm, David Newman wrote:
> On 10/16/13 12:44 PM, dweimer wrote:
>> On 10/16/2013 2:13 pm, David Newman wrote:
>>> On 10/14/13 2:44 AM, Martin Simmons wrote:
>>>>>>>>> On Sun, 13 Oct 2013 18:25:07 -0700, David Newman said:
>>>>>
>>>>> On 10/9/13 4:41 PM, David Newman wrote:
>>>>>> FreeBSD 9.2-RELEASE, bacula-client-5.2.12_3 installed from ports
>>>>>>
>>>>>> Ever since upgrading this host to FreeBSD 9.2, bacula-fd crashes as soon
>>>>>> as bacula-dir starts a backup job. The entry in /var/log/messages is:
>>>>>>
>>>>>> Oct 9 16:25:50 o bacula-fd: Bacula interrupted by signal 0: UNKNOWN SIGNAL
>>>>>>
>>>>>> Backups worked fine on this host running FreeBSD 9.1 and other hosts
>>>>>> upgraded to FreeBSD 9.2 run backups OK.
>>>>>>
>>>>>> I've done the uninstall/reinstall thing with the bacula-client port, but
>>>>>> that made no difference.
>>>>>>
>>>>>> Thanks in advance for troubleshooting clues.
>>>>>>
>>>>>> dn
>>>>>
>>>>> Is there a Wireshark decode for Bacula?
>>>>>
>>>>> I'm still stuck on this problem, and need more info on what's causing
>>>>> that UNKNOWN SIGNAL error. Wireshark 1.8.6 just shows strings of bytes
>>>>> for the Bacula stuff.
>>>>>
>>>>> Thanks.
>>>>>
>>>>> dn
>>>>
>>>> A wireshark decode won't help much here because problems like this must be in
>>>> the fd itself.
>>>>
>>>> Try attaching gdb to the bacula-fd process and see if it catches the
>>>> mysterious signal (see
>>>> http://www.bacula.org/5.2.x-manuals/en/problems/problems/What_Do_When_Bacula.html#SECTION00640000000000000000).
>>>
>>> No luck with this. Per that URL, I've put the btraceback.gdb file in the
>>> same directory as the bacula-fd executable on the client (in this case,
>>> /usr/local/sbin) and made the .gdb file executable.
>>>
>>> At run time it produces this error:
>>>
>>> /usr/local/sbin/btraceback.gdb:1: Error in sourced command file:
>>> No symbol table is loaded. Use the "file" command.
>>>
>>> That's problem 1. Problem 2 is that the syntax given for capturing
>>> STDERR and STDOUT -- 2>\&1 -- doesn't work on either csh (root's default
>>> on FreeBSD) or bash.
>>>
>>> Any ideas on remedying either issue?
>>>
>>> Thanks.
>>>
>>> dn
>>>
>>
>> I have 2>&1, no backslash before the ampersand used with /bin/sh in
>> several cron scripts, on FreeBSD seems to do the job
>
> Thanks, that works for capturing STDERR and STDOUT.
>
> But that .gdb file still produces the same error:
>
> /usr/local/sbin/btraceback.gdb:1: Error in sourced command file:
> No symbol table is loaded. Use the "file" command.
>
> So, I'm still blocked on debugging this issue.
>
> dn
>
>
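
On the redirection syntax discussed in the quoted exchange: the sketch below shows the /bin/sh form that was reported to work (2>&1 with no backslash), plus the csh form as a comment. The helper name capture_both and the temporary file are made up for illustration; nothing here comes from the Bacula manual.

```sh
#!/bin/sh
# Bourne-style shells (/bin/sh, bash): "2>&1" points stderr at wherever
# stdout currently goes, so stdout must be redirected first.
# There is no backslash before the ampersand.

# capture_both is a hypothetical helper: $1 is the output file, the rest
# is the command whose stdout and stderr should both land in that file.
capture_both() {
    out=$1
    shift
    "$@" > "$out" 2>&1
}

# csh/tcsh do not accept "2>&1"; the equivalent there is:
#   some_command >& output.txt

# Demo: one line written to each stream ends up in the same file.
demo_log=$(mktemp /tmp/capture.XXXXXX)
capture_both "$demo_log" sh -c 'echo to-stdout; echo to-stderr >&2'
cat "$demo_log"
```

Under csh (root's default shell on FreeBSD) the simplest route is usually to put the command in a small /bin/sh script, or to run it as sh -c '...', so the Bourne syntax applies.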
Well, one of my FreeBSD 9.2 systems decided to take a new route to this problem. My backups started failing this morning without the bacula-fd process stopping: it starts the ClientRunBeforeJob script, then after two hours fails with no response from the client.

2013-10-30 07:52:34 bacula-dir JobId 291: Start Backup JobId 291, Job=Webmail-Backup.2013-10-30_07.52.32_46
2013-10-30 07:52:34 bacula-dir JobId 291: Using Device "FileStorage"
2013-10-30 07:52:35 webmail-fd JobId 291: shell command: run ClientRunBeforeJob "/root/bacula/before.sh"
2013-10-30 07:52:35 webmail-fd JobId 291: ClientRunBeforeJob:
2013-10-30 07:52:35 webmail-fd JobId 291: ClientRunBeforeJob: Create PostgreSQL Backup...
2013-10-30 07:52:35 webmail-fd JobId 291: ClientRunBeforeJob:
2013-10-30 07:52:35 webmail-fd JobId 291: ClientRunBeforeJob: Getting Database List
2013-10-30 07:52:35 webmail-fd JobId 291: ClientRunBeforeJob:
2013-10-30 09:58:46 bacula-dir JobId 291: Fatal error: Socket error on ClientRunBeforeJob command: ERR=Connection reset by peer
2013-10-30 09:58:46 bacula-dir JobId 291: Fatal error: Client "webmail-fd" RunScript failed.
2013-10-30 09:58:46 bacula-dir JobId 291: Fatal error: Network error with FD during Backup: ERR=Connection reset by peer
2013-10-30 09:58:47 bacula-dir JobId 291: Fatal error: No Job status returned from FD.
2013-10-30 09:58:47 bacula-dir JobId 291: Error: Bacula bacula-dir 5.2.12 (12Sep12):
  Build OS:               amd64-portbld-freebsd9.2 freebsd 9.2-RELEASE
  JobId:                  291
  Job:                    Webmail-Backup.2013-10-30_07.52.32_46
  Backup Level:           Incremental, since=2013-10-29 00:07:02
  Client:                 "webmail-fd" 5.2.12 (12Sep12) amd64-portbld-freebsd9.2,freebsd,9.2-RELEASE
  FileSet:                "WebmailZFS-FileSet" 2013-09-27 13:12:07
  Pool:                   "File" (From Job resource)
  Catalog:                "MyCatalog" (From Client resource)
  Storage:                "File" (From Pool resource)
  Scheduled time:         30-Oct-2013 07:52:30
  Start time:             30-Oct-2013 07:52:34
  End time:               30-Oct-2013 09:58:47
  Elapsed time:           2 hours 6 mins 13 secs
  Priority:               10
  FD Files Written:       0
  SD Files Written:       0
  FD Bytes Written:       0 (0 B)
  SD Bytes Written:       0 (0 B)
  Rate:                   0.0 KB/s
  Software Compression:   None
  VSS:                    no
  Encryption:             no
  Accurate:               no
  Volume name(s):
  Volume Session Id:      6
  Volume Session Time:    1383098903
  Last Volume Bytes:      27,632,643,492 (27.63 GB)
  Non-fatal FD errors:    1
  SD Errors:              0
  FD termination status:  Error
  SD termination status:  OK
  Termination:            *** Backup Error ***

When I check this server, the client run before job script has completed: all the database dumps were successful, and the ZFS snapshots that follow the database dumps completed as well. However, Bacula stops returning the script's status. This server was running fine up through the full backup done Monday morning, but now hits this problem on every backup attempt today. A reboot didn't help, and trying a full backup instead of an incremental made no difference.

I canceled one of the attempts and restarted it after removing the client run before job script; it's now backing up files just fine. So I have temporarily set up a cron job to run 30 minutes before the backup to execute my database backups and ZFS snapshots, and removed the client run before job. I can find no errors logged on the server running bacula-fd or on the Bacula server, with the exception of the timeout error message.
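
A minimal sketch of how a cron-driven workaround like the one described could keep a record of what the pre-backup script did, since Bacula no longer collects its output. Only the path /root/bacula/before.sh comes from the job log; the wrapper itself, the run_logged name, and the log location are assumptions for illustration, not the actual setup.

```sh
#!/bin/sh
# Hypothetical cron wrapper: run a command outside Bacula and keep its
# output and exit status in a log file.

# run_logged LOGFILE COMMAND [ARGS...]
# Appends a start line, the command's stdout and stderr, and its exit
# status to LOGFILE, then returns that exit status to the caller.
run_logged() {
    log=$1
    shift
    echo "$(date '+%Y-%m-%d %H:%M:%S') starting: $*" >> "$log"
    # stdin comes from /dev/null so the command never waits for input.
    "$@" >> "$log" 2>&1 < /dev/null
    rc=$?
    echo "$(date '+%Y-%m-%d %H:%M:%S') exit status: $rc" >> "$log"
    return "$rc"
}

# Run only when a command was given on the command line, e.g.
#   run-before.sh /root/bacula/before.sh
# The default log path below is an assumed location.
if [ "$#" -gt 0 ]; then
    run_logged "${LOG:-/var/log/bacula-before.log}" "$@"
fi
```

Saved under a name such as run-before.sh (hypothetical) and called from root's crontab 30 minutes ahead of the scheduled job, this leaves a timestamped trace that can be compared against the Bacula job log if the dumps or snapshots ever fail.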
I tried adding a heartbeat interval of 1 minute on the client; that didn't help either.

--
Thanks,
Dean E. Weimer
http://www.dweimer.net/

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users