On 12/12/24 3:28 PM, Rob Gerber wrote:
Hello,

A few very large files don't appear to have changed since the last full backup, yet are being backed up repeatedly by incremental / differential backups. Why?

Details:

I am using bacula to back up a fileserver for one of my customers. For the primary share, I am using a full/diff/inc backup cycle. I have a few very large files that won't change (system images for user PCs). I didn't want to use up my storage backing up those system images multiple times, so I set up a simpler backup cycle that would take a full once, then a diff and inc periodically thereafter. The expectation was that the files wouldn't change and we'd rarely add files, so most backups would be 0 files / 0 bytes. The users' PC files are routinely backed up by a separate service.

The problem is that the system images have been backed up 3 times during the last 6 months. I picked one of the system image files and ran an SQL query for it against the database. The results indicated that the target file was backed up 3 times. The file hashes were the same each time, but the lstat column had some differences. Unfortunately, the metadata is encoded somehow and I don't understand what changed. I ran sha512sum against the file in question, but its output looked different from the value in the bacula database. I used python to base64 decode the catalog hash, and when the output is displayed in hex it matches the hash provided by sha512sum. So the file appears to be unchanged from its first time being backed up, yet has been backed up 3 times.
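The hash comparison described above can be sketched in a few lines of Python. This is only an illustration, assuming the catalog stores the digest as standard-alphabet base64 with the trailing "=" padding stripped; the helper name catalog_digest_to_hex and the sample payload are made up for the example and are not part of Bacula.

```python
import base64
import hashlib


def catalog_digest_to_hex(b64_digest: str) -> str:
    """Convert a base64 digest (padding possibly stripped) to lowercase hex."""
    # Assumption: the stored value lacks '=' padding, so restore it before decoding.
    padded = b64_digest + "=" * (-len(b64_digest) % 4)
    return base64.b64decode(padded).hex()


# Stand-in for a catalog entry: SHA-512 of a sample payload, base64 encoded
# with the padding removed (hypothetical data, not taken from the thread).
sample_digest = hashlib.sha512(b"example payload").digest()
catalog_value = base64.b64encode(sample_digest).decode("ascii").rstrip("=")

# The hex form is what sha512sum prints, so the two can be compared directly.
hex_from_catalog = catalog_digest_to_hex(catalog_value)
print(hex_from_catalog == hashlib.sha512(b"example payload").hexdigest())
```

With a real catalog value, the result of catalog_digest_to_hex would be compared against the first column of the sha512sum output for the file.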
[...snip...]
The file was backed up 3 times.

Jobid 70: full backup.
Jobid 642: incremental backup.
Jobid 667: differential backup.

It makes sense that a differential backup after an incremental backup would contain the contents of the incremental. However, the bacula database has the same hash for each of the 3 backups. Doesn't that imply that the file hasn't changed? I don't know why it was backed up again in the incremental backup. Please note that there were many incremental or differential backups run daily that didn't see any changes or need to back up any files in this dataset, except for these few times.

My query is in the pastebin link (I don't know if the length of the query will mess up someone's email client).

https://pastebin.com/6PCbZ21s

Here is the metadata section from the query:

| lstat                                                                |
+-----------------------------------------------------------------------
| P0j TaAhK IHw B Pu Pw A MQ2A6aB BAA BiGwPA BmB6ge BmB6gdBmB6ge A A C |
| P0j TaAhK IHw B Pu Pw A MQ2A6aB BAA BiGwPA BmNcn/ BmB6gdBnQLI1 A A C |
| P0j TaAhK IHw B Pu Pw A MQ2A6aB BAA BiGwPA BnQXpa BmB6gdBnQLI1 A A C |
[...snip...]
Here is the relevant fileset:

FileSet {
  Name = "Stallone_macrium_backups"
  Include {
    Options {
      signature = SHA512
      Verify = s3
    }
    File = /mnt/data/shares/cncshare/Craeon/Backup/macrium
  }
  Exclude {
  }
}

If there is something else I can provide, please let me know.

Regards,
Robert Gerber
Hello Rob,

For the first two, we can see that the decoded lstat tells us that the mtime and atime have changed:

----8<----
*.bvfs_decode_lstat lstat="P0j TaAhK IHw B Pu Pw A MQ2A6aB BAA BiGwPA BmB6ge BmB6gdBmB6ge A A C"
st_nlink=1
st_mode=33264
perm=-rwxrw----
st_uid=1006
st_gid=1008
st_size=842719798913
st_blocks=1645937600
st_ino=325584970
st_ctime=0
st_mtime=6952011706864740382   <----
st_atime=1711777822            <----
st_dev=64803
LinkFI=0

*.bvfs_decode_lstat lstat="P0j TaAhK IHw B Pu Pw A MQ2A6aB BAA BiGwPA BmNcn/ BmB6gdBnQLI1 A A C"
st_nlink=1
st_mode=33264
perm=-rwxrw----
st_uid=1006
st_gid=1008
st_size=842719798913
st_blocks=1645937600
st_ino=325584970
st_ctime=0
st_mtime=6952011706885255733   <---- More recent time
st_atime=1714801151            <---- More recent time
st_dev=64803
LinkFI=0
----8<----

So the second one was backed up because it was indeed modified.

But for the third one the only thing I see that changed was the atime:

----8<----
*.bvfs_decode_lstat lstat="P0j TaAhK IHw B Pu Pw A MQ2A6aB BAA BiGwPA BnQXpa BmB6gdBnQLI1 A A C"
st_nlink=1
st_mode=33264
perm=-rwxrw----
st_uid=1006
st_gid=1008
st_size=842719798913
st_blocks=1645937600
st_ino=325584970
st_ctime=0
st_mtime=6952011706885255733
st_atime=1732344410            <---- More recent atime than the previous one, should not matter, not normally tested :)
st_dev=64803
LinkFI=0
----8<----

So, this looks correct. It gets backed up by the Full (jobid 70), then there is a change and it gets backed up by the Inc (jobid 642), then of course it will get backed up by the subsequent Differential (jobid 667).
If you run another incremental, this file should not get backed up (unless of course it gets modified after the Differential :)

I would double check that the jobids were actually run at the levels we think, and in the order we think. It looks like Full, Inc, Diff to me here, and that order, with a modification occurring before the Inc, would explain the three backups.
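One detail worth double-checking in the decoded output above: in the lstat strings as pasted, there is no space between the twelfth and thirteenth fields (for example "BmB6gdBmB6ge"), which would account for the very large st_mtime values and for st_ctime=0. The following is a minimal Python sketch of a decoder, not Bacula's own code. It assumes the base-64 digit alphabet and field order that reproduce the st_dev, st_ino, st_size, st_blocks and st_atime values shown above, and it assumes that a fused 12-character token is really two 6-character timestamps (mtime, then ctime). The function names are hypothetical, the rows are assumed to be in jobid order, and negative values are not handled.

```python
# Digit alphabet assumed for the lstat encoding; each field is read as a
# big-endian base-64 number. This reproduces the values shown in the thread.
B64_DIGITS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

# Assumed field order of the space-separated lstat string.
FIELDS = [
    "st_dev", "st_ino", "st_mode", "st_nlink", "st_uid", "st_gid", "st_rdev",
    "st_size", "st_blksize", "st_blocks", "st_atime", "st_mtime", "st_ctime",
    "LinkFI",
]


def from_base64(token: str) -> int:
    """Decode one lstat token into an integer (non-negative values only)."""
    value = 0
    for ch in token:
        value = value * 64 + B64_DIGITS.index(ch)
    return value


def split_fused_times(tokens: list) -> list:
    """Assumption: a 12-char token in the mtime slot is mtime + ctime run together."""
    if len(tokens) > 11 and len(tokens[11]) == 12:
        fused = tokens[11]
        return tokens[:11] + [fused[:6], fused[6:]] + tokens[12:]
    return tokens


def decode_lstat(lstat: str) -> dict:
    tokens = split_fused_times(lstat.split())
    return {name: from_base64(tok) for name, tok in zip(FIELDS, tokens)}


# lstat strings exactly as pasted in the thread, keyed by the assumed jobid.
rows = {
    70: "P0j TaAhK IHw B Pu Pw A MQ2A6aB BAA BiGwPA BmB6ge BmB6gdBmB6ge A A C",
    642: "P0j TaAhK IHw B Pu Pw A MQ2A6aB BAA BiGwPA BmNcn/ BmB6gdBnQLI1 A A C",
    667: "P0j TaAhK IHw B Pu Pw A MQ2A6aB BAA BiGwPA BnQXpa BmB6gdBnQLI1 A A C",
}

for jobid, lstat in rows.items():
    d = decode_lstat(lstat)
    print(jobid, "atime", d["st_atime"], "mtime", d["st_mtime"], "ctime", d["st_ctime"])
```

Under these assumptions, st_mtime decodes to 1711777821 in all three rows, while st_ctime goes from 1711777822 (jobid 70) to 1732293173 (jobids 642 and 667). That would mean a ctime change (for example a permission, ownership or rename operation) rather than a content change happened before the Inc, which is still consistent with the Full, Inc, Diff sequence described above and with the unchanged hash.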
It's been a long day, so I may have this wrong. :)

Hope this helps,
Bill

--
Bill Arlofski
w...@protonmail.com
_______________________________________________ Bacula-users mailing list Bacula-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bacula-users