I have a product I need to back up data for. It stores data in a "contentstore" which is structured in a way that makes it very predictable what needs to be backed up every day. Let me explain: every time a piece of content is created, it is stored under this specific hierarchy:
contentstore_root
  |_ YYYY (e.g. 2019)
     |_ mm (1...12)
        |_ dd (1...31)
           |_ HH (1...24)
              |_ MM (1...59)

This directory structure is stored on a GFS filesystem, which has proven to be very poor at traversing long lists of files (a caveat of all distributed filesystems, I guess). For example, doing an `ls -lR` on the GFS mountpoint is significantly slower than on a local extX FS. This of course has an impact on the time taken for incremental backups, to the point where backing up the contentstore root (which contains tens of millions of files) takes more than half a day.

It should also be noted that when a file is updated on the system, the old version stays where it was, and the new version is stored as a new file under the "new date" directory path. When a file is deleted, it remains on the filesystem for a while before an internal job of the application moves it to a trash folder after a "grace period".

As a consequence, given the directory structure and the general behaviour of the application, I don't feel that incremental backups are really needed, and I'd like to get rid of them if possible to avoid those long backups. However, every configuration I could come up with has huge drawbacks in one way or another. For example, a daily backup of the folder for the past day plus a monthly backup of the folder for the past month makes it very hard and complex to do a full restore in case of disaster recovery.

I'm really interested in knowing if anybody is dealing with a similar kind of backup and how they deal with it. Also, if you have any idea of what features could be helpful in my case (I took a look at VirtualFull, but that doesn't seem to really solve the burden of backup administration)... in short, any insight is appreciated.

I'd like to avoid as much as possible having to deal with local (tar) archiving, as we are talking about tens of TB of data here and can't really afford having twice the space used in order to be able to back it up.

Regards
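To make the "back up only the past day's folder" idea concrete: since the layout is purely date-based, the path to include in the daily job can be computed by a small script rather than discovered by traversing the filesystem. A minimal sketch in Python follows; the root mount point is a made-up example, and I'm assuming the month/day components are zero-padded (adjust if the application writes them unpadded, e.g. `1` instead of `01`):

```python
from datetime import date, timedelta

def contentstore_path(root: str, day: date) -> str:
    """Build the contentstore directory for a given day, following
    the YYYY/mm/dd layout described above. Assumes zero-padded
    month and day components."""
    return f"{root}/{day:%Y/%m/%d}"

# Path for yesterday's content, e.g. printed for the backup job to consume
yesterday = date.today() - timedelta(days=1)
print(contentstore_path("/mnt/gfs/contentstore_root", yesterday))
```

Something like this could feed the daily FileSet dynamically (Bacula can read the file list from a program's output), so the job never has to walk the tens of millions of older files.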
_______________________________________________ Bacula-users mailing list Bacula-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bacula-users