On 2020-12-11 12:53, Chris Green wrote:
[…] wrote a trivial[ish] script that copied
all the backups to a new destination sequentially (using --link-dest)
and then removed the original tree, having checked the new backups
were OK of course.
Facing the same problem as you, I once came up with exactly the same
solution.
But then, since I had to automate it, I worked on it a bit more and
ended up with a shell script that:
- recursively listed files as "file size - inode - path"
- with sort and awk, output the list of "every size that has different
inodes"
- for each size in that output, ran cksum on one file per inode
- if two different inodes (with the same file size) had matching
cksums, replaced every file of the later inode with a hard link to
the first inode
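
As a rough illustration of the steps listed above, here is a minimal
Python sketch. It is not the original shell script: it swaps cksum for
SHA-256 (hashlib), takes "first inode" to mean the lowest inode number,
and assumes the whole tree lives on a single filesystem, so that inode
numbers are unique and hard links are possible. The function names and
the ".dedupe-tmp" suffix are made up for the example.

```python
import hashlib
import os
import stat
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file's content."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def scan(root):
    """Build size -> inode -> [paths] for regular files under root."""
    by_size = defaultdict(lambda: defaultdict(list))
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: symlinks are not followed
            if stat.S_ISREG(st.st_mode):
                by_size[st.st_size][st.st_ino].append(path)
    return by_size


def dedupe(root, dry_run=True):
    """Hard-link files with identical content; return paths relinked."""
    relinked = 0
    for _size, inodes in scan(root).items():
        if len(inodes) < 2:
            continue  # only one inode has this size: nothing to compare
        keeper_by_digest = {}
        # Assumption: "first" inode = lowest inode number.
        for _ino, paths in sorted(inodes.items()):
            # One checksum per inode is enough: all its paths share content.
            digest = file_digest(paths[0])
            keeper = keeper_by_digest.setdefault(digest, paths[0])
            if keeper == paths[0]:
                continue  # first inode seen with this content
            for path in paths:
                if not dry_run:
                    # Link under a temporary name, then rename over the
                    # duplicate, so the path never goes missing.
                    tmp = path + ".dedupe-tmp"  # hypothetical suffix
                    os.link(keeper, tmp)
                    os.replace(tmp, path)
                relinked += 1
    return relinked
```

Calling dedupe(root) with the default dry_run=True only counts what
would be relinked; pass dry_run=False to actually replace the files.
A matching checksum is taken as proof of identical content here, as in
the steps above; a byte-by-byte comparison would be the cautious extra.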
If you have to run it frequently, you may want to implement something
similar.
Although it ignores mtime info (and thus loses it when hard-linking),
it has the great benefit of finding every duplicate, even one that was
renamed and moved to another directory
(as in
./her.2020-12-01/Library/Mail/…/Sent.mbox/…/Attachments/…/PhotoDeFamille.JPG
versus ./his.2020-11-26/perso/photos/100_9999.JPG).
(By the way, I reimplemented it in C, "just for fun" and for speed
too: https://github.com/outtersg/dude/ . Hmm, it's in C, but in French.)
--
Guillaume