-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Since you are in an environment with millions of files, I highly recommend that you move to ZFS storage and use ZFS snapshots instead of --link-dest. This is much more space-efficient, makes rsync runs faster, and lets old backups be deleted in seconds. Rsync doesn't have to understand anything about ZFS: you just rsync to the same directory every time and have ZFS take a snapshot of that directory's dataset between runs.
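For illustration, one such cycle might look roughly like the sketch below. The dataset name tank/backups/server1, its mountpoint, the source server1:/home/, and the helper names run, backup_once, and expire are all made up for the example; only the rsync options and the zfs snapshot / zfs destroy subcommands are real. By default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: every name below is a hypothetical placeholder, adjust to taste.
DATASET="tank/backups/server1"     # assumed ZFS dataset holding the backup
DEST="/tank/backups/server1/"      # assumed mountpoint of that dataset
SRC="server1:/home/"               # assumed rsync source
DRY_RUN="${DRY_RUN:-1}"            # 1 = print commands only, 0 = execute them

# Print a command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# One backup cycle: rsync into the same directory, then snapshot the dataset.
# $1 is the snapshot name, e.g. a date such as 2015-04-06.
backup_once() {
    run rsync -a --delete "$SRC" "$DEST" || return 1
    run zfs snapshot "${DATASET}@$1"
}

# Drop an old backup by destroying its snapshot.
expire() {
    run zfs destroy "${DATASET}@$1"
}
```

A nightly job could then call something like backup_once "$(date +%F)" with DRY_RUN=0, and expire 2015-03-01 once that backup is no longer wanted. No --link-dest and no empty target tree are involved; every run writes into the same directory.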

On 04/06/2015 01:51 AM, Ken Chase wrote:
> Feature request: allow --link-dest dir to be linked to even if file
> exists in target.
>
> This statement from the man page is adhered to too strongly IMHO:
>
> "This option works best when copying into an empty destination
> hierarchy, as rsync treats existing files as definitive (so it
> never looks in the link-dest dirs when a destination file already
> exists)".
>
> I was surprised by this behaviour, as generally the scheme is to be
> efficient/save space with rsync.
>
> When the file is out of date but exists in the --link-dest target,
> it would be great if it could be removed and linked. If an option
> was supplied to request this behaviour, I'd actually throw some
> money at making it happen. (And a further option to retain a copy
> if inode permissions/ownership would otherwise be changed.)
>
> Reasoning:
>
> I back up many servers with --link-dest that have filesystems of
> 10+M files on them. I do not delete old backups (which takes 60 min
> or more per tree) just so rsync can recreate them all in an empty
> target dir when <1% of files change per day (takes 3-5 hrs per
> backup!).
>
> Instead, I cycle them in with mv $olddate $today and then rsync
> --del --link-dest over them, which takes 30-60 min depending. (Yes,
> there is some risk of permissions being altered there; I am mostly
> interested in contents though.) Problem is, if a file exists AT
> ALL, even out of date, a new copy is put over top of it per the
> above man page decree.
>
> Thus much more disk space is used. Running this scheme, in which
> old backups are moved into place and written over, accumulates
> many copies of the exact same file over time.
> Running pax -rpl over the copies before rsyncing to them works
> (and saves much space!), but takes a very long time as it
> traverses and compares 2 large backup trees, thrashing the same
> device (on the order of 3-5x rsync's time, i.e. 3-5 hrs for pax;
> hardlink(1) is far worse, I suspect some non-linear algorithm
> therein, as it ran 3-5x slower than pax again).
>
> I have detailed an example of this scenario at
>
> http://unix.stackexchange.com/questions/193308/rsyncs-link-dest-option-does-not-link-identical-files-if-an-old-file-exists
>
> which also indicates --delete-before and --whole-file do not help
> at all.
>
> /kc
>

- -- 
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~
	Kevin Korb			Phone:    (407) 252-6853
	Systems Administrator		Internet:
	FutureQuest, Inc.		ke...@futurequest.net  (work)
	Orlando, Florida		k...@sanitarium.net (personal)
	Web page:			http://www.sanitarium.net/
	PGP public key available on web site.
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iEYEARECAAYFAlUirykACgkQVKC1jlbQAQc83ACfa7lawkyPFyO9kDE/D8aztql0
AkAAoIQ970yTCHB1ypScQ8ILIQR6zphl
=ktEg
-----END PGP SIGNATURE-----
-- 
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html