[Bacula-users] Rerun Failed Levels not working as supposed ?

Brice Figureau Wed, 06 Apr 2005 02:22:36 -0700

Hi,

First: I'm (still) happily running 1.36.1, and the more I use Bacula,
the better I find it.


Following the good advice to set 'Rerun Failed Levels = yes' in my Job
definition, along with a 'Run Before Job' that checks whether the host is up
and running, I encountered the following problem.

I had an FD that was shut down for a long time, so each of its jobs
failed, as shown by this SQL query:
mysql> select JobId,Level,ClientId,JobStatus,StartTime,PoolId,FileSetId from Job where name='pierre';
+-------+-------+----------+-----------+---------------------+--------+-----------+
| JobId | Level | ClientId | JobStatus | StartTime           | PoolId | FileSetId |
+-------+-------+----------+-----------+---------------------+--------+-----------+
|    41 | F     |        3 | T         | 2005-03-16 01:10:40 |      1 |         4 |
|    45 | I     |        3 | T         | 2005-03-17 01:08:36 |      3 |         4 |
|    51 | I     |        3 | E         | 2005-03-18 01:08:12 |      3 |         4 |
|    57 | I     |        3 | T         | 2005-03-18 09:56:06 |      3 |         4 |
|    61 | I     |        3 | E         | 2005-03-18 21:33:09 |      3 |         4 |
|    68 | D     |        3 | T         | 2005-03-21 21:36:29 |      2 |         4 |
|    75 | I     |        3 | T         | 2005-03-22 21:33:58 |      3 |         4 |
|    82 | I     |        3 | T         | 2005-03-23 21:33:57 |      3 |         4 |
|    89 | I     |        3 | E         | 2005-03-24 21:34:09 |      3 |         4 |
|    96 | I     |        3 | E         | 2005-03-25 21:32:29 |      3 |         4 |
|   103 | D     |        3 | E         | 2005-03-28 21:36:04 |      2 |         4 |
|   110 | I     |        3 | E         | 2005-03-29 21:35:13 |      3 |         4 |
|   117 | I     |        3 | E         | 2005-03-30 21:33:17 |      3 |         4 |
|   124 | I     |        0 | f         | 2005-03-31 21:35:42 |      0 |         0 |
|   131 | I     |        0 | f         | 2005-04-01 21:35:05 |      0 |         0 |
|   138 | F     |        0 | f         | 2005-04-04 22:56:38 |      0 |         0 |
|   145 | I     |        3 | T         | 2005-04-05 21:36:32 |      3 |         4 |
+-------+-------+----------+-----------+---------------------+--------+-----------+

I added the 'Run Before Job' on 3/31/2005, and from then on the job status
code changed from E (error) to f (failed).

Two days ago, the scheduled backup was a Full, but the host was down.
Yesterday the backup was an Incremental, and this particular FD was up
and running. I expected the job to be upgraded to Full, but it was not:

05-Apr 21:36 arsenic-dir: Start Backup JobId 145, Job=pierre.2005-04-05_21.30.03
05-Apr 21:36 pierre-fd: Since time adjusted by -1 seconds.
05-Apr 21:36 arsenic-sd: Volume "Inc-0005" previously written, moving to end of data.
05-Apr 21:46 arsenic-dir: Bacula 1.36.1 (26Nov04): 05-Apr-2005 21:46:43
  JobId:                  145
  Job:                    pierre.2005-04-05_21.30.03
  Backup Level:           Incremental, since=2005-03-23 21:33:57
  Client:                 pierre-fd
  FileSet:                "Pierre Set" 2005-03-15 17:18:12
  Pool:                   "Inc-Pool"
  Storage:                "File"
  Start time:             05-Apr-2005 21:36:32
  End time:               05-Apr-2005 21:46:43
  FD Files Written:       864
  SD Files Written:       864
  FD Bytes Written:       2,720,456,391
  SD Bytes Written:       2,720,642,779
  Rate:                   4452.5 KB/s
  Software Compression:   18.8 %
  Volume name(s):         Inc-0005
  Volume Session Id:      16
  Volume Session Time:    1112282457
  Last Volume Bytes:      3,153,354,566
  Non-fatal FD errors:    0
  SD Errors:              0
  FD termination status:  OK
  SD termination status:  OK
  Termination:            Backup OK

After reading the code, I found that the problem is the following:

* A Job that fails because its 'Run Before Job' returned an error gets a
Job status of Failed (f) and has neither a ClientId nor a FileSetId (both
are recorded as 0, as in JobIds 124, 131 and 138 above).

* The 'Rerun Failed Levels' part of get_level_since_time() in fd_cmds.c
does not promote the job to Full or Differential, because it searches for
a failed Job that has a valid ClientId.
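
To make the second point concrete, here is a minimal sketch in Python with
an in-memory SQLite table. It is not Bacula's actual query: the table holds
only a few columns and rows copied from the listing above, and both lookup
functions and their WHERE clauses (including checking only status 'f') are
assumptions made for illustration. It only shows why a lookup keyed on
ClientId cannot see a failed Full job that was recorded with ClientId 0.

```python
import sqlite3

# Illustrative catalog: a subset of the Job columns and rows shown above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Job (JobId INTEGER, Name TEXT, Level TEXT, "
    "ClientId INTEGER, JobStatus TEXT, StartTime TEXT, FileSetId INTEGER)"
)
conn.executemany(
    "INSERT INTO Job VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (82, "pierre", "I", 3, "T", "2005-03-23 21:33:57", 4),
        (124, "pierre", "I", 0, "f", "2005-03-31 21:35:42", 0),
        (131, "pierre", "I", 0, "f", "2005-04-01 21:35:05", 0),
        (138, "pierre", "F", 0, "f", "2005-04-04 22:56:38", 0),
    ],
)

# Start time of the last successful job (the "since" value in the job report).
SINCE = "2005-03-23 21:33:57"


def failed_full_by_client(client_id):
    """Hypothetical lookup keyed on ClientId (assumed shape, not Bacula's SQL)."""
    cur = conn.execute(
        "SELECT JobId FROM Job WHERE Level = 'F' AND JobStatus = 'f' "
        "AND ClientId = ? AND StartTime > ? ORDER BY JobId",
        (client_id, SINCE),
    )
    return [row[0] for row in cur]


def failed_full_by_name(name):
    """Hypothetical alternative keyed on the Job name instead of ClientId."""
    cur = conn.execute(
        "SELECT JobId FROM Job WHERE Level = 'F' AND JobStatus = 'f' "
        "AND Name = ? AND StartTime > ? ORDER BY JobId",
        (name, SINCE),
    )
    return [row[0] for row in cur]


print(failed_full_by_client(3))       # pierre-fd has ClientId 3 in the catalog
print(failed_full_by_name("pierre"))
```

In this sketch the ClientId lookup returns nothing, because JobId 138 was
stored with ClientId 0, while the lookup by Job name does return it.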

Is this a design limitation of 'Run Before Job', or is it a bug?
I can file a bug report if necessary.

Does anyone have a workaround (besides removing my 'Run Before Job')?
And finally, how can I fix this and force an upgrade to Full for this
FD?

Thanks for your answer,
-- 
Brice Figureau
Days of Wonder



