Hello,

Thanks a lot, Radosław, for your responses.

You are obviously right that restarting a job cannot work if it has been deleted from the catalog.

I am a little confused that a Failed job can nevertheless be treated (in some circumstances) as an Incomplete one, and I wonder how my job would be classified.

So, from a backup of the virtual machine, I was able to return to the situation just after the crash and reboot of the machine (with my job, jobID=25, still recorded in the catalog in Failed state, since my purge/delete attempts had not yet been made at that point).

I then restarted this job (at the restart prompt, it was listed among the Failed jobs and not among the Incomplete jobs). The new job (now in Running state with jobID=26) is writing to the last volume ("volume42" from my initial message), apparently right after the data backed up by jobID=25 (at least, this is suggested by the increase of volbytes and volfiles for this volume).

Could you please tell me whether, at this point, there is any way to know if this is a true restart (from the point reached at the time of the crash) or if it started from zero? In other words, was jobID=25 considered an Incomplete job or a Failed job?

As a reminder, jobID=25 is still reported with jobstatus "f" (I guess Failed) by the command list jobid=25. The commands list files jobid=25 and list files jobid=26 do not return any filename (for the moment). Also, the jobfiles field is 0 for both jobs (for the moment).
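For reference, here is a minimal Python sketch of how the one-letter jobstatus shown by list jobid could be read. The table is only a subset of the status codes Bacula stores in the catalog, where "f" stands for a fatal error. The helper name resumes_where_it_left_off is hypothetical, and its rule (only an Incomplete job resumes, any other state starts over) is simply the one stated in the reply quoted below, not something checked against the Bacula sources.

```python
# Subset of Bacula's one-letter JobStatus codes as shown by "list jobid=...".
JOB_STATUS = {
    "C": "Created, not yet running",
    "R": "Running",
    "T": "Terminated normally",
    "E": "Terminated in error",
    "f": "Fatal error",
    "A": "Canceled by user",
    "I": "Incomplete",
}


def resumes_where_it_left_off(status: str) -> bool:
    """Hypothetical helper: True only for an Incomplete job.

    Assumption taken from this thread: a restart continues from the
    interruption point only when the job is Incomplete; any other
    state means the job is run again from the start.
    """
    return status == "I"


if __name__ == "__main__":
    status = "f"  # value reported for jobID=25 in this message
    print(JOB_STATUS[status], "- resumes:", resumes_where_it_left_off(status))
```

Under that assumption, a job reported with status "f" would be run again from the start rather than resumed.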

Thanks again,

Dan


On 02/09/2021 at 17:53, Radosław Korzeniewski wrote:
Hello,

Thu, 2 Sep 2021 at 16:07 Dan-Gabriel CALUGARU <dan-gabriel.calug...@ec-lyon.fr <mailto:dan-gabriel.calug...@ec-lyon.fr>> wrote:

    Hello everybody,

    I would like to ask for your help to continue the backup of a
    storage space of around 300 TB.

    I am using Bacula version 9.6.7.

    I was able to divide this work into several jobs of about 15-20 TB
    (one week for each job) to be able to resume more easily if there
    was a problem.
    After several such jobs successfully completed (I have already
    backed up nearly 250 TB), the machine hosting the bacula server
    crashed while my last backup job (jobID = 25) was running.
    Could you advise me what is the best way to continue in such a case ?

    As additional information, I would note that this job appears with
    Failed status and that it had written (before the crash) on 2
    volumes (which are LTO-7 tape cartridges with a capacity of
    approximately 6TB):
    - about 2TB on the 1st volume "volume41" (which became Full),
    knowing that the previous job (well finished) had already written
    the first 4TB
    - about 1TB on the 2nd volume "volume42" (which was empty before
    the job, and is still in Append status)

    I have tried so far:

    1) purge files jobid=25

    but this command seems to have done nothing because jobID=25 was
    still present in the catalog (the outputs of the commands list
    jobid=25 and list joblog jobid=25 did not change after this command)

    then

    2) delete jobid=25

    which deleted this job from the catalog, since I got this message:

    /JobId = 25 and associated records deleted from the catalog./

    and the outputs of the commands list jobid=25 and list joblog
    jobid=25 have changed ("No results to list")

    On the other hand, the information on the two volumes has not
    changed and if I restart with restart jobid=25


To restart the job, Bacula needs the failed job's data to be available in the catalog so that it knows the restart point, and you simply deleted that data. It won't work that way.

    I have the impression that bacula acts as if it is another job, so
    it continues to write on the 2nd volume ("volume42") after the
    1TB already written (by the previous Failed job). Therefore, the
    space written by the Failed job (jobID = 25) no longer seems to be
    used and will therefore remain "lost".


Yes. When no information about a failed job is available, Bacula is unable to restart that job, so it just starts it from scratch. For a restart to succeed, the job has to be in an "incomplete" state. Any other state restarts the job from the start.


    Instead, I would like bacula to reuse this space (the 2TB on the 1st
    volume "volume41" and the 1TB on the 2nd volume "volume42").
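To put a number on the space in question, here is a small Python sketch that adds up the figures quoted above. All values are the approximate ones from this message (about 2TB and 1TB written by jobID=25, LTO-7 cartridges of roughly 6TB); the variable names are only illustrative.

```python
TB = 10**12  # decimal terabyte, in bytes

# Approximate amounts written by the failed job (jobID=25), per volume.
written_by_failed_job = {
    "volume41": 2 * TB,
    "volume42": 1 * TB,
}

# Approximate capacity of one LTO-7 cartridge, as quoted in the message.
LTO7_CAPACITY = 6 * TB

# Space that stays occupied on tape if the job is rerun from scratch.
orphaned = sum(written_by_failed_job.values())

print(f"orphaned: {orphaned / TB:.0f} TB")
print(f"share of one cartridge: {orphaned / LTO7_CAPACITY:.0%}")
```

With these figures, the data left behind by the failed job amounts to about 3TB, or roughly half of one cartridge.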


It is absolutely possible with the "incomplete" jobs feature and job restart. But you should never delete an incomplete job from the catalog when you want it to be restartable.

    Indeed, from what I understood, for Failed jobs, we have to start
    from scratch, but I would like to reuse the space that was written
    by the Failed job (because it is unusable).


As I wrote above, you can reuse the already written data when your job is incomplete and you restart it without deleting it. :)


    Do you have a technique for doing this ?

    Thank you in advance for any response

Just take a look at the Bacula manual:

/8.2.12 Incomplete Jobs
During a backup, if the Storage daemon experiences disconnection with the File daemon during backup (normally a comm line problem or possibly an FD failure), under conditions that the SD determines to be safe it will make the failed job as Incomplete rather than failed. This is done only if there is sufficient valid backup data that was written to the Volume. The advantage of an Incomplete job is that it can be restarted by the new bconsole restart command from the point where it left off rather than from the beginning of the jobs as is the case with a cancel.
/

best regards
--
Radosław Korzeniewski
rados...@korzeniewski.net <mailto:rados...@korzeniewski.net>


--
Dan-Gabriel CALUGARU
IR en Calcul Scientifique (CNRS)
Dr de Mathématiques et Applications

Laboratoire de Mécanique des Fluides et d'Acoustique
UMR 5509 CNRS - ECL - UCBL - INSA Lyon - Univ. de Lyon
Bâtiment I11 - bureau 11098
ECOLE CENTRALE de LYON
36, avenue Guy de Collongue
69134 ECULLY

tel: +33 (0)4 72 18 61 73

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
