Hello,

This feature has been planned from the very first day that a line of code was written for Bacula. It is listed in the "projects" file but still has not found a developer.

Regards,

Kern

On Wednesday 22 November 2006 21:33, Chris Hoogendyk wrote:
>
> Kern Sibbald wrote:
> > On Wednesday 22 November 2006 17:54, Alan Brown wrote:
> >
> >> The only package I'm aware of which doesn't rely on OS timestamps uses a
> >> database to keep snapshots of the filesystem state (and is quite
> >> expensive)
> >>
> >> Bacula has an extensive database, why not USE it?
> >>
> >
> > This is no problem. It has been planned. This database is already used to
> > detect changed files for Verify. It works very well.
> >
> >> ==================
> >>
> >> The fundamental problem with almost all backup software is an assumption
> >> that file timestamps will only ever increase, never decrease. While this
> >> is right most of the time, it's not right all the time.
> >>
> >> The current problem is that Bacula (in common with almost all other backup
> >> software) only looks for mtime or ctime higher than the last backup start
> >> time.
> >>
> >> If for whatever reason a file appears with mtime and ctime PRIOR to the
> >> last backup start time (e.g. rsynced in, or last backup was of an unattached
> >> filesystem stub...), it won't get looked at.
> >>
> >> These can be taken care of by comparing the current filesystem
> >> state with the last known filesystem state in the database and
> >> backing up files which have appeared regardless of timestamp.
> >>
> >> Once you start doing this, it is possible to note if a file has gone
> >> missing and flag it as deleted in the database - that takes care of
> >> restores bringing back zombie files, effectively giving "snapshot"
> >> restoration without having to replicate the entire database.
> >>
> >> Another worst-case scenario:
> >>
> >> An attacker replaces a system-critical file with another of the same size
> >> and tweaks ctime+mtime. This won't be backed up, even if the file is on a
> >> different inode (some versions of file replacement may not change the
> >> inode being used, some filesystems (reiser) do not store inode
> >> information). This means there is no way of telling when the file was
> >> changed and renders all backups suspect.
> >>
> >> The only way to deal with this is checksumming the file vs
> >> stored file checksums and that is computationally expensive.
> >>
> >
> > I would really like to see this implemented. All the mechanisms already
> > exist, it is just a matter of implementing them.
> >
> > Best regards,
> >
> > Kern
>
> I've been following the list for a while, but haven't commented on
> anything yet.
>
> Retrospect had an awesome reputation in the Mac world for over a decade.
> One of the things that made them really good was their ability to
> recognize duplicate files (even on different machines within the same
> backup set), added files, removed files, etc. Backing up a bunch of
> machines over the network started a little slow on the first and then
> picked up momentum and compactness as the backup proceeded. For example,
> the first Mac OS 7.1 machine would back up everything. The second would
> end up saying "I already have that file" many times, would back up
> faster because it would skip the file copy over the network, and would
> end up much smaller on the tape, because it would have a bunch of links
> and only the full files unique to that computer.
>
> Their reputation didn't fare as well as they expanded into Windows and
> Linux, had to deal with Mac OS X, and then got bought up by EMC. They
> may still be a worthwhile example to look at. When I talked to them in
> the 90's, they said they were using a bunch of information to identify
> what needed to be backed up. I don't know if they were using checksums
> or what.
>
> The client/server interaction becomes more complex. Instead of the
> client deciding what needed to be backed up based on a backup level
> passed to it from the server, the client would send information about
> all the files to the server and the server would have to respond back
> with what to back up based on comparing that against the database.
>
> ---------------
>
> Chris Hoogendyk
>
> -
>    O__  ---- Systems Administrator
>   c/ /'_ --- Biology & Geology Departments
>  (*) \(*) -- 140 Morrill Science Center
> ~~~~~~~~~~ - University of Massachusetts, Amherst
>
> <[EMAIL PROTECTED]>
>
> ---------------
>
> Erdös 4

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
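
For readers who want to see the idea from the thread in concrete form, here is a minimal Python sketch of the comparison Alan Brown describes: diff the current filesystem state against the last recorded state, so that files are picked up regardless of whether their timestamps moved forward, and files that have vanished can be flagged as deleted. It is only an illustration under stated assumptions. The names `FileState`, `scan_tree`, `diff_states` and `sha256_of` are hypothetical and are not Bacula's API or catalog schema, and the "database" is just an in-memory dictionary.

```python
import hashlib
import os
import stat
from dataclasses import dataclass


@dataclass(frozen=True)
class FileState:
    """Hypothetical per-file record; not Bacula's actual catalog schema."""
    size: int
    mtime: float
    ctime: float
    inode: int


def scan_tree(root):
    """Return {relative_path: FileState} for every regular file under root."""
    states = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)  # lstat: do not follow symlinks
            if stat.S_ISREG(st.st_mode):
                rel = os.path.relpath(full, root)
                states[rel] = FileState(
                    st.st_size, st.st_mtime, st.st_ctime, st.st_ino
                )
    return states


def diff_states(previous, current):
    """Compare two snapshots and return (added, changed, deleted) path lists.

    A file counts as changed when ANY recorded attribute differs, so a
    timestamp that moved backwards is caught as well as one that moved
    forwards. No comparison against the last backup start time is made.
    """
    added = sorted(current.keys() - previous.keys())
    deleted = sorted(previous.keys() - current.keys())
    changed = sorted(
        path
        for path in current.keys() & previous.keys()
        if current[path] != previous[path]
    )
    return added, changed, deleted


def sha256_of(path, chunk_size=1 << 20):
    """Checksum a file in chunks; the expensive fallback mentioned above."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

In this sketch, `added` plus `changed` would be the files to back up and `deleted` the ones to mark as removed in the stored state. A replaced file whose size, timestamps and inode all match the stored record would not show up in `changed`; catching that case requires comparing `sha256_of` against a stored checksum for every file, which is the computational cost raised in the thread.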