Hello,

A few very large files don't appear to have changed since the last full backup, yet they are being backed up repeatedly by incremental/differential backups. Why?
Details:

I am using Bacula to back up a file server for one of my customers. For the primary share, I am using a full/diff/inc backup cycle. I have a few very large files that won't change (system images for user PCs). I didn't want to use up my storage backing up those system images multiple times, so I set up a simpler backup cycle for them that takes a full backup once, then differential and incremental backups periodically thereafter. The expectation was that the files wouldn't change and we'd rarely add files, so most backups would be 0 files / 0 bytes. The users' PC files are routinely backed up by a separate service.

The problem is that the system images have been backed up 3 times during the last 6 months.

I picked one of the system image files and ran an SQL query for it against the database. The results indicated that the target file was backed up 3 times. The file hashes were the same each time, but the lstat column had some differences. Unfortunately, the metadata is encoded somehow and I don't understand what changed.

I ran sha512sum against the file in question. At first its output looked different from the hash in the Bacula database, but that turned out to be an encoding difference: when I used Python to base64-decode the catalog hash and displayed the result in hex, it matched the hash provided by sha512sum. So the file appears to be unchanged since it was first backed up, yet it has been backed up 3 times.

    [root@stallone macrium]# sha512sum A6A87E2D36FAE06E-Mike\ W11-00-00.mrimg
    dc396719178be8a34c028f32475d84d53c42cd5c4de63bed15154154eed258ebb010bf0a366f525c4d5cc04d2d41653806f57689a1f1fdb42ce8b27500e4037f  A6A87E2D36FAE06E-Mike W11-00-00.mrimg

The file was backed up 3 times:

    Jobid 70:  full backup
    Jobid 642: incremental backup
    Jobid 667: differential backup

It makes sense that a differential backup after an incremental backup would contain the contents of the incremental. However, the Bacula database has the same hash for each of the 3 backups. Doesn't that imply that the file hasn't changed?
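The hash comparison described above can be sketched in a few lines of Python. This is only an illustration: the helper name `catalog_digest_to_hex` is hypothetical, and it assumes the catalog stores the digest as standard-alphabet base64, possibly without the trailing `=` padding. The sample value is built from placeholder bytes, not taken from the catalog.

```python
import base64
import hashlib


def catalog_digest_to_hex(b64_digest: str) -> str:
    """Convert a base64 digest string (possibly unpadded) to lowercase hex.

    Assumption: the digest uses the standard base64 alphabet and may be
    missing its trailing '=' padding, so the padding is restored first.
    """
    cleaned = b64_digest.strip()
    padded = cleaned + "=" * (-len(cleaned) % 4)
    return base64.b64decode(padded).hex()


if __name__ == "__main__":
    # Placeholder data standing in for a catalog entry (not a real value).
    digest = hashlib.sha512(b"sample data").digest()
    unpadded = base64.b64encode(digest).decode("ascii").rstrip("=")

    # The hex form should equal what sha512sum would print for the same bytes.
    print(catalog_digest_to_hex(unpadded) == hashlib.sha512(b"sample data").hexdigest())
```

With a real catalog value, the returned hex string is what would be compared against the sha512sum output shown above.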
I don't know why it was backed up again in the incremental backup. Please note that many incremental or differential backups run daily against this dataset, and they didn't see any changes or need to back up any files, except for these few times.

My query is in the pastebin link (I don't know if the length of the query will mess up someone's email client).

https://pastebin.com/6PCbZ21s

Here is the metadata section from the query:

    | lstat                                                                |
    +----------------------------------------------------------------------+
    | P0j TaAhK IHw B Pu Pw A MQ2A6aB BAA BiGwPA BmB6ge BmB6gdBmB6ge A A C |
    | P0j TaAhK IHw B Pu Pw A MQ2A6aB BAA BiGwPA BmNcn/ BmB6gdBnQLI1 A A C |
    | P0j TaAhK IHw B Pu Pw A MQ2A6aB BAA BiGwPA BnQXpa BmB6gdBnQLI1 A A C |

Here is the output of 'stat':

    [root@stallone macrium]# stat A6A87E2D36FAE06E-Mike\ W11-00-00.mrimg
      File: A6A87E2D36FAE06E-Mike W11-00-00.mrimg
      Size: 842719798913    Blocks: 1645937600 IO Block: 4096   regular file
    Device: fd23h/64803d    Inode: 325584970   Links: 1
    Access: (0760/-rwxrw----)  Uid: ( 1006/    mike)   Gid: ( 1008/hermanstaff)
    Context: system_u:object_r:samba_share_t:s0
    Access: 2024-12-12 08:44:43.737875113 -0600
    Modify: 2024-03-30 00:50:21.816717186 -0500
    Change: 2024-11-22 10:32:53.237897204 -0600
     Birth: 2024-03-29 15:51:12.128951779 -0500

Here is the relevant fileset:

    FileSet {
      Name = "Stallone_macrium_backups"
      Include {
        Options {
          signature = SHA512
          Verify = s3
        }
        File = /mnt/data/shares/cncshare/Craeon/Backup/macrium
      }
      Exclude { }
    }

If there is something else I can provide, please let me know.

Regards,
Robert Gerber
402-237-8692
r...@craeon.net
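The lstat values quoted above can be decoded with a short script. This is a sketch under stated assumptions, not a description of Bacula's code: it assumes each space-separated token is an integer written in base 64 with the alphabet `A-Z a-z 0-9 + /` (most significant digit first), and that the first thirteen fields are dev, ino, mode, nlink, uid, gid, rdev, size, blksize, blocks, atime, mtime, ctime. It also assumes the 12-character run in each row (for example `BmB6gdBmB6ge`) is mtime and ctime with a lost space, so the rows below are written with that space restored. The names `b64_to_int`, `decode_lstat`, and `changed_fields` are hypothetical. As a cross-check, the decoded device, inode, size, and mtime of the first row agree with the `stat` output above.

```python
from datetime import datetime, timezone

# Assumed alphabet for the per-field integer encoding.
B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

# Assumed order of the leading lstat fields.
FIELDS = ["dev", "ino", "mode", "nlink", "uid", "gid", "rdev",
          "size", "blksize", "blocks", "atime", "mtime", "ctime"]

# Rows from the post, in the order shown there. Assumption: the fused
# mtime/ctime token is split into two 6-character tokens.
ROWS = [
    "P0j TaAhK IHw B Pu Pw A MQ2A6aB BAA BiGwPA BmB6ge BmB6gd BmB6ge A A C",
    "P0j TaAhK IHw B Pu Pw A MQ2A6aB BAA BiGwPA BmNcn/ BmB6gd BnQLI1 A A C",
    "P0j TaAhK IHw B Pu Pw A MQ2A6aB BAA BiGwPA BnQXpa BmB6gd BnQLI1 A A C",
]


def b64_to_int(token):
    """Decode one token as a base-64 integer, most significant digit first."""
    value = 0
    for ch in token:
        value = value * 64 + B64.index(ch)
    return value


def decode_lstat(lstat):
    """Map the leading tokens of an lstat string to named integer fields."""
    return {name: b64_to_int(tok) for name, tok in zip(FIELDS, lstat.split())}


def changed_fields(rows):
    """Return the field names whose decoded values differ between rows."""
    decoded = [decode_lstat(row) for row in rows]
    return [f for f in FIELDS if len({d[f] for d in decoded}) > 1]


if __name__ == "__main__":
    print("fields that differ:", changed_fields(ROWS))
    for i, row in enumerate(ROWS, start=1):
        d = decode_lstat(row)
        times = {k: datetime.fromtimestamp(d[k], tz=timezone.utc).isoformat()
                 for k in ("atime", "mtime", "ctime")}
        print(f"row {i}: size={d['size']} {times}")
```

Under these assumptions the three rows differ only in atime and ctime. The ctime in the second and third rows decodes to 2024-11-22 16:32:53 UTC, which corresponds to the Change time in the `stat` output, while mtime is identical in all three rows.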