Hello,
Thanks a lot Radoslaw for your responses.
You are obviously right that restarting a job cannot work if it has been
deleted from the catalog.
I'm a little confused, however, that a Failed job can (in some
circumstances) be treated as an Incomplete one, and I am wondering how
my job will be treated.
So, using a backup of the virtual machine, I was able to return to
the situation just after the crash and reboot of the machine (with
my job jobID=25 properly recorded in the catalog, in Failed state, since
my purge/delete attempts had not yet been made at that point).
I then restarted this job (at the restart prompt, it was listed under
Failed jobs and not under Incomplete jobs): the new job (now in Running
state with jobId=26) is writing to the last volume ("volume42" from my
initial message), apparently right after the data backed up by jobID=25
(in any case, this is suggested by the increase of volbytes and volfiles
for this media).
Could you please tell me whether, at this point, there is any way to
know if it is indeed a true restart (from the point reached at the time
of the crash) or if it started from zero? In other words, was jobID=25
considered an Incomplete job or a Failed job?
As a reminder, jobID=25 is still reported with jobstatus "f" (I
guess Failed) by the command list jobid=25.
Also, the commands list files jobid=25 and list files jobid=26 do not
return (for the moment) any filename, and the jobfiles field is 0 for
both jobs (for the moment).
Thanks again,
Dan
On 02/09/2021 at 17:53, Radosław Korzeniewski wrote:
Hello,
Thu, 2 Sep 2021 at 16:07, Dan-Gabriel CALUGARU
<dan-gabriel.calug...@ec-lyon.fr> wrote:
Hello everybody,
I would like to ask for your help to continue the backup of a storage
space of around 300 TB.
I am using Bacula version 9.6.7.
I was able to divide this work into several jobs of about 15-20 TB
(one week for each job) to be able to resume more easily if there
was a problem.
After several such jobs completed successfully (I have already
backed up nearly 250 TB), the machine hosting the Bacula server
crashed while my last backup job (jobID = 25) was running.
Could you advise me what is the best way to continue in such a case ?
As additional information, I would note that this job appears with
Failed status and that it had written (before the crash) on 2
volumes (which are LTO-7 tape cartridges with a capacity of
approximately 6TB):
- about 2TB on the 1st volume "volume41" (which became Full),
given that the previous job (which finished successfully) had already
written the first 4TB
- about 1TB on the 2nd volume "volume42" (which was empty before
the job, and is still in Append status)
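As a quick check of the figures above, the space involved can be tallied with a few lines of Python. This is only arithmetic on the approximate sizes quoted in the message; the variable names are made up for illustration.

```python
# Approximate sizes (in TB) taken from the message; names are illustrative.
LTO7_CAPACITY_TB = 6
previous_job_on_volume41 = 4                      # written by the earlier, successful job
job25_written = {"volume41": 2, "volume42": 1}    # written by jobID=25 before the crash

# volume41 reached its approximate capacity, which is why it became Full.
volume41_total = previous_job_on_volume41 + job25_written["volume41"]

# Space occupied by the Failed job across both tapes.
job25_total = sum(job25_written.values())

print(volume41_total, job25_total)
```

So roughly 3TB of tape holds data from jobID=25, and that is the space at stake if the job has to start over.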
I have tried so far:
1) purge files jobid=25
but this command seems to have done nothing, because jobID=25 was
still present in the catalog (the outputs of the commands list
jobid=25 and list joblog jobid=25 did not change after this command)
then
2) delete jobid=25
which deleted this job from the catalog, because I got this message:
/JobId = 25 and associated records deleted from the catalog./
and the outputs of the commands list jobid=25 and list joblog
jobid=25 have changed ("No results to list")
On the other hand, the information on the two volumes has not
changed and if I restart with restart jobid=25
To restart the job Bacula requires proper data for the failed job
available in the catalog to know the restart point, which you just
simply deleted. It won't work that way.
I have the impression that Bacula acts as if it were another job, so
it continues to write on the 2nd volume ("volume42") after the
1TB already written (by the previous Failed job). As a result, the
space written by the Failed job (jobID = 25) no longer seems to be
used and will remain "lost".
Yes. When no information about a failed job is available then Bacula
is unable to restart that job, so it just starts it from scratch. For
any successful job restart it has to be in an "incomplete" state. Any
other state restarts a job from the start.
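The rule described here can be written down as a small Python sketch. It is only an illustration of the statement above, not Bacula code: the function name is hypothetical, and the status letters are assumed to follow the catalog's JobStatus convention ("f" for a fatal error, as seen in this thread, and "I" for Incomplete).

```python
from typing import Optional

# Subset of catalog JobStatus letters (assumed mapping, for illustration only).
JOB_STATUS = {
    "T": "Terminated normally",
    "R": "Running",
    "f": "Fatal error (Failed)",
    "I": "Incomplete",
}

def restart_behaviour(status: Optional[str]) -> str:
    """Hypothetical helper: what a restart does for a given job status.

    None stands for a job whose record was deleted from the catalog.
    """
    if status == "I":
        return "resume"        # continues from the point where the job left off
    return "from scratch"      # any other state, or no catalog record at all

print(restart_behaviour("f"))
```

Under this reading, a job listed with jobstatus "f" would be rerun from the beginning, and the same holds for a job that has been deleted from the catalog.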
Instead, I would like Bacula to reuse this space (the 2TB on the 1st
volume "volume41" and the 1TB on the 2nd volume "volume42").
It is absolutely possible with the "incomplete" jobs feature and job
restart. But you should never delete an incomplete job from the
catalog when you want it to be restartable.
Indeed, from what I understood, for Failed jobs we have to start
from scratch, but I would like to reuse the space written by the
Failed job (because it is otherwise unusable).
As I wrote above, you can reuse already available data when your job
is incomplete and you restart it without deleting. :)
Do you have a technique for doing this ?
Thank you in advance for any response
Just take a look at the Bacula manual:
/8.2.12 Incomplete Jobs
During a backup, if the Storage daemon experiences disconnection with
the File daemon during backup (normally a comm line problem or possibly
an FD failure), under conditions that the SD determines to be safe it
will make the failed job as Incomplete rather than failed. This is done
only if there is sufficient valid backup data that was written to the
Volume. The advantage of an Incomplete job is that it can be restarted
by the new bconsole restart command from the point where it left off
rather than from the beginning of the jobs as is the case with a cancel.
/
best regards
--
Radosław Korzeniewski
rados...@korzeniewski.net <mailto:rados...@korzeniewski.net>
--
Dan-Gabriel CALUGARU
IR en Calcul Scientifique (CNRS)
Dr de Mathématiques et Applications
Laboratoire de Mécanique des Fluides et d'Acoustique
UMR 5509 CNRS - ECL - UCBL - INSA Lyon - Univ. de Lyon
Bâtiment I11 - bureau 11098
ECOLE CENTRALE de LYON
36, avenue Guy de Collongue
69134 ECULLY
tel: +33 (0)4 72 18 61 73
_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users