Hi group, Following up on my run in with backing up around a daylight saving time change, the following sets out what the error reports were and why I found it difficult to work out exactly what was causing the problem, and this a discovered by accident via bat.....
A Bacula (v 2.4.4) job blocks all others due to a summer time change. The time change causes the already completed job to be re-run, it doesn't, and blocks. This prevents the error messages indicating that has gone wrong from appearing in the Bacula log files or notification emails - only bat shows this error and only when opened _once_ after the error occurred. The following error reporting behavior obscured what was the problem. 1) Initial run of a full backup job runs and completes successfully. The next job (catalog backup) also runs and completes with out error. 2) Received two emails indicating each job completed. 3) Expected a futher 3 emails indicating the three services, bacula-dir, fd and sd were running. However no such emails were received. 4) Bacula log file showed the same content as the email received at 2) i.e. the last few lines of the log file suggest nothing is amiss: 05-Apr 02:51 bacula-dir JobId 432: Begin pruning Jobs. 05-Apr 02:51 bacula-dir JobId 432: No Jobs found to prune. 05-Apr 02:51 bacula-dir JobId 432: Begin pruning Files. 05-Apr 02:51 bacula-dir JobId 432: No Files found to prune. 05-Apr 02:51 bacula-dir JobId 432: End auto prune. 05-Apr 02:51 bacula-dir JobId 432: AfterJob: run command "/home/mv/bin/bacula/delete_catalog_backup" 5) Received 'intervention needed' emails with the contents below. I may be dense, but I didn't interpret 'Cannot find any appendable volumes' to mean 'Volume "Full-MVLaptop-2009-04-05" already exists' and this is quite possibly my fault for how I have the email alerts configured (I assume I can change their contents?) : 05-Apr 02:05 bacula-sd JobId 433: Job MVLaptop.2009-04-05_02.05.00.10 waiting. Cannot find any appendable volumes. Please use the "label" command to create a new Volume for: Storage: "FileDevice" (/media/disk) Pool: Full-Pool Media type: File 6) Bacula-tray-monitor showed an ambiguous message indicating the same as the intervention email I received: 7) Opened bat and saw the following line that was my clue about a job trying to run twice: 05-Apr 02:05 bacula-dir JobId 433: Error: sql_create.c:400 Volume "Full-MVLaptop-2009-04-05" already exists. 8) Closed and opened bat and did not see this error message again..... 9) The Bacula log file still showed only that the last job had completed successfully, the time stamp was 02:51. 10) Deleted the stalled job using bat console 11) NOW the log file shows the much more informative errors at 2:05, which comes after 2:51.... :) 05-Apr 02:51 bacula-dir JobId 432: Begin pruning Jobs. 05-Apr 02:51 bacula-dir JobId 432: No Jobs found to prune. 05-Apr 02:51 bacula-dir JobId 432: Begin pruning Files. 05-Apr 02:51 bacula-dir JobId 432: No Files found to prune. 05-Apr 02:51 bacula-dir JobId 432: End auto prune. 05-Apr 02:51 bacula-dir JobId 432: AfterJob: run command "/home/mv/bin/bacula/delete_catalog_backup" 05-Apr 02:05 bacula-dir JobId 433: Start Backup JobId 433, Job=MVLaptop.2009-04-05_02.05.00.10 05-Apr 02:05 bacula-dir JobId 433: Error: sql_create.c:400 Volume "Full-MVLaptop-2009-04-05" already exists. 05-Apr 02:05 bacula-dir JobId 433: Using Device "FileDevice" 05-Apr 02:05 bacula-dir JobId 433: Error: sql_create.c:400 Volume "Full-MVLaptop-2009-04-05" already exists. 05-Apr 02:05 bacula-dir JobId 433: Error: sql_create.c:400 Volume "Full-MVLaptop-2009-04-05" already exists. 05-Apr 02:05 bacula-sd JobId 433: Job MVLaptop.2009-04-05_02.05.00.10 waiting. Cannot find any appendable volumes. Please use the "label" command to create a new Volume for: Storage: "FileDevice" (/media/disk) Pool: Full-Pool Media type: File 12) I now get one email about the cancelled job and this contains what I wish I had got in the 'intervention needed' emails, as well as the log file: 05-Apr 02:05 bacula-dir JobId 433: Start Backup JobId 433, Job=MVLaptop.2009-04-05_02.05.00.10 05-Apr 02:05 bacula-dir JobId 433: Error: sql_create.c:400 Volume "Full-MVLaptop-2009-04-05" already exists. 05-Apr 02:05 bacula-dir JobId 433: Using Device "FileDevice" 05-Apr 02:05 bacula-dir JobId 433: Error: sql_create.c:400 Volume "Full-MVLaptop-2009-04-05" already exists. 05-Apr 02:05 bacula-dir JobId 433: Error: sql_create.c:400 Volume "Full-MVLaptop-2009-04-05" already exists. 05-Apr 02:05 bacula-sd JobId 433: Job MVLaptop.2009-04-05_02.05.00.10 waiting. Cannot find any appendable volumes. Please use the "label" command to create a new Volume for: Storage: "FileDevice" (/media/disk) Pool: Full-Pool Media type: File 05-Apr 03:05 bacula-dir JobId 433: Error: sql_create.c:400 Volume "Full-MVLaptop-2009-04-05" already exists. 05-Apr 03:05 bacula-sd JobId 433: Job MVLaptop.2009-04-05_02.05.00.10 waiting. Cannot find any appendable volumes. Please use the "label" command to create a new Volume for: Storage: "FileDevice" (/media/disk) Pool: Full-Pool Media type: File 05-Apr 05:05 bacula-dir JobId 433: Error: sql_create.c:400 Volume "Full-MVLaptop-2009-04-05" already exists. 05-Apr 05:05 bacula-sd JobId 433: Job MVLaptop.2009-04-05_02.05.00.10 waiting. Cannot find any appendable volumes. Please use the "label" command to create a new Volume for: Storage: "FileDevice" (/media/disk) Pool: Full-Pool Media type: File 05-Apr 09:05 bacula-dir JobId 433: Error: sql_create.c:400 Volume "Full-MVLaptop-2009-04-05" already exists. 05-Apr 09:05 bacula-sd JobId 433: Job MVLaptop.2009-04-05_02.05.00.10 waiting. Cannot find any appendable volumes. Please use the "label" command to create a new Volume for: Storage: "FileDevice" (/media/disk) Pool: Full-Pool Media type: File 05-Apr 17:05 bacula-dir JobId 433: Error: sql_create.c:400 Volume "Full-MVLaptop-2009-04-05" already exists. 05-Apr 17:05 bacula-sd JobId 433: Job MVLaptop.2009-04-05_02.05.00.10 waiting. Cannot find any appendable volumes. Please use the "label" command to create a new Volume for: Storage: "FileDevice" (/media/disk) Pool: Full-Pool Media type: File 06-Apr 07:38 bacula-fd JobId 433: Fatal error: job.c:1814 Comm error with SD. bad response to Append Data. ERR=Interrupted system call 06-Apr 07:38 bacula-sd JobId 433: Job MVLaptop.2009-04-05_02.05.00.10 marked to be canceled. 06-Apr 07:38 bacula-sd JobId 433: Job MVLaptop.2009-04-05_02.05.00.10 canceled while waiting for mount on Storage Device ""FileDevice" (/media/disk)". 06-Apr 07:38 bacula-dir JobId 433: Bacula bacula-dir 2.4.4 (28Dec08): 06-Apr-2009 07:38:57 Build OS: i686-pc-linux-gnu suse 11.0 JobId: 433 Job: MVLaptop.2009-04-05_02.05.00.10 Backup Level: Full Client: "client-fd" 2.4.4 (28Dec08) i686-pc-linux-gnu,suse,11.0 FileSet: "Full Set" 2009-03-28 02:05:00 Pool: "Full-Pool" (From Job FullPool override) Storage: "File" (From Job resource) Scheduled time: 05-Apr-2009 02:05:00 Start time: 05-Apr-2009 02:05:02 End time: 06-Apr-2009 07:38:57 Elapsed time: 1 day 5 hours 33 mins 55 secs Priority: 10 FD Files Written: 0 SD Files Written: 0 FD Bytes Written: 0 (0 B) SD Bytes Written: 0 (0 B) Rate: 0.0 KB/s Software Compression: None VSS: no Storage Encryption: no Volume name(s): Volume Session Id: 7 Volume Session Time: 1238640563 Last Volume Bytes: 0 (0 B) Non-fatal FD errors: 0 SD Errors: 0 FD termination status: Error SD termination status: Canceled Termination: Backup Canceled 13) NOW the Bacula log file shows the same contents as the above email. Apologies if that seems a bit of a whinge, it wasn't intended that way... it just bring back memories of a very unsettling and then very frustrating experience :) HTH Mark ------------------------------------------------------------------------------ This SF.net email is sponsored by: High Quality Requirements in a Collaborative Environment. Download a free trial of Rational Requirements Composer Now! http://p.sf.net/sfu/www-ibm-com _______________________________________________ Bacula-users mailing list Bacula-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bacula-users