Happy new year, everyone!

I have a testing environment that I'm assembling for a client: Bacula
15.0.2, running in a Rocky Linux VM on a Synology DS423+.

I have tested backups with a Windows FD using a VM running Windows 11.
This went well. A few days ago I added one of my Windows 11
workstations as an FD. This FD has about 1.44TB of space used. The FD name
is 'akita'.

Because I'd just added the FD client to akita, I started a full backup.
Overnight, the routinely scheduled incremental backup queued up behind the
full backup. Looking at the logs, the scheduled incremental waited until 20
seconds after the manually started full backup of akita had concluded. In
such a situation I would expect an incremental to complete with very few
files. However, this is not what happened. The incremental was upgraded to
a full because 'no suitable full backup was found'. This would be
reasonable, except that such a full backup had literally just finished. Worse, the
upgraded incremental had a number of VSS writer errors that resulted in a
backup that was incomplete. The first, correct full backup clocked in at
1.3TB in size. The erroneously upgraded incremental backup was about 300GB.
Even worse, the second full finished with a T status, indicating a
successful backup. If I hadn't known better, I would have thought that this
was a good backup based on the status.

What on earth is going on here? Why didn't Bacula recognize the previously
run full backup as valid? Any idea why many of the VSS writers failed? It's
particularly concerning that the job finished with a T status despite so
many VSS writers failing, and an obviously incomplete backup state (to the
eye of an operator with knowledge of the system in question). Terrifyingly,
this most recent 'full' backup is now considered to be the latest full
backup. A subsequent incremental backup (using 'accurate = no') failed to
recognize the missing data, but an estimate of an incremental with 'accurate =
yes' suggests that it would have correctly added the missing data to the
backup chain. I intend to switch to using 'accurate = yes'. I was testing
to see if it was important, and I would say that this test shows that it
certainly could have resulted in a better outcome in this case (at least a
subsequent incremental backup would capture the data missing from the full
backup).
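
For reference, the change I have in mind is roughly the following fragment
of the Job resource in bacula-dir.conf. This is only a sketch: the job name
is a placeholder rather than my actual configuration, and all the other
directives of the job are omitted and would stay as they are.

```
Job {
  Name = "akita-backup"    # placeholder name, for illustration only
  # ... existing directives (Client, FileSet, Schedule, Pool, etc.) unchanged ...
  Accurate = yes           # compare against the catalog's file list on Incremental/Differential
}
```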

This is all on a test system where the data doesn't matter and all the
database and volumes are going to be erased before production. *I would
like insights into why this happened, and more importantly, how to avoid it
in the future.* I do wonder if the whole problem was kicked off because
Bacula didn't have time to properly record the client's backup state
before the scheduled incremental backup started, but I can't imagine how to
keep such a problem from occurring again if a backup runs long enough. I have
had backups run for WEEKS on a Linux system and haven't had this problem
before.

Log for the first backup job (I know about the error regarding the G:\ drive;
this might be reasonable, since it's Google Drive mounted at G:\)
https://pastebin.com/rSx7cQW6

Log for the second job, erroneously upgraded to full with an incomplete
state and status T
https://pastebin.com/ncLu3jHs

Akita configuration and related items
https://pastebin.com/ae1jLgpN

akita-fd.trace from c:\program files\bacula\working on akita
https://pastebin.com/hQuEJMt7

I note that many of the items listed as having issues are in c:\backups,
which holds a files-and-folders dump from a previous Windows machine
(minerva-w10). I briefly wondered if the issue might be that there is a
parallel Windows system in that folder, but I see that it also failed to
back up an mrimg image file containing that same system. The system should
certainly be unaware of the contents of the image file. There were also
issues with a number of other, unrelated files and folders, including
c:\games and c:\boot. Of course, the first full backup had no issues with
any of this.

I would welcome input from anyone with knowledge about this strange
behavior. Thank you in advance.

Regards,
Robert Gerber
402-237-8692
r...@craeon.net
_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users