On 2020-12-11 12:53, Chris Green wrote:

[…] wrote a trivial[ish] script that copied
all the backups to a new destination sequentially (using --link-dest)
and then removed the original tree, having checked the new backups
were OK of course.

Facing the same problem as you, I once came up with exactly the same solution.

But then, needing to automate it, I worked on it a bit more and ended up with a shell script that:
- recursively listed files as "file size - inode - path"
- used sort and awk to output every size shared by more than one inode
- for each such size, ran cksum on one file per inode
- if two different inodes (with the same file size) had matching checksums, replaced every file of the later inode with a hard link to the first inode
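
The four steps above can be sketched roughly as follows. This is not the original script: it is a minimal Python illustration rather than the shell/sort/awk pipeline described, it uses SHA-256 where the original used cksum, and the names `dedupe` and `file_digest` as well as the ".dedupe-tmp" suffix are made up for the example. It also assumes that files are not modified while it runs and that candidates on different devices should never be linked.

```python
import hashlib
import os
import stat
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Hash a file's content (SHA-256 here, standing in for cksum)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def dedupe(root):
    """Hard-link identical regular files under root.

    Returns the number of paths that were replaced by a link.
    """
    # Step 1: list regular files as (device, size) -> inode -> [paths].
    groups = defaultdict(lambda: defaultdict(list))
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if stat.S_ISREG(st.st_mode):
                groups[(st.st_dev, st.st_size)][st.st_ino].append(path)

    relinked = 0
    for inodes in groups.values():
        # Step 2: only sizes shared by several inodes can hide duplicates.
        if len(inodes) < 2:
            continue
        first_seen = {}  # digest -> one path of the first inode with it
        for ino in sorted(inodes):
            paths = inodes[ino]
            # Step 3: checksum a single file per inode.
            keeper = first_seen.setdefault(file_digest(paths[0]), paths[0])
            if keeper == paths[0]:
                continue  # first inode seen with this content
            # Step 4: point every path of this inode at the first inode.
            for path in paths:
                tmp = path + ".dedupe-tmp"  # hypothetical temporary name
                os.link(keeper, tmp)
                os.replace(tmp, path)  # swap the link in place of the file
                relinked += 1
    return relinked
```

Linking to a temporary name and then renaming it over the target means a path never disappears if the run is interrupted. Like the approach described above, the sketch trusts a checksum match and does no byte-by-byte comparison.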

If you have to run it frequently, you may want to implement something similar.
Although it ignores mtime (and thus discards the replaced file's mtime when hard-linking),
it has the great benefit of finding every duplicate, even one that was renamed and moved to another directory (as in ./her.2020-12-01/Library/Mail/…/Sent.mbox/…/Attachments/…/PhotoDeFamille.JPG versus ./his.2020-11-26/perso/photos/100_9999.JPG).
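
To illustrate the mtime remark: all hard links to an inode share one set of timestamps, so a path that is replaced by a link takes on the mtime of the inode that was kept. A small sketch, assuming a filesystem that supports hard links; the file names, timestamps and the function name `link_and_compare_mtime` are invented for the demonstration:

```python
import os
import tempfile


def link_and_compare_mtime():
    """Return (copy mtime before linking, copy mtime after, original mtime)."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original.jpg")
        copy = os.path.join(d, "copy.jpg")
        for path in (original, copy):
            with open(path, "wb") as f:
                f.write(b"same bytes")
        # Give the two identical files different modification times.
        os.utime(original, (1_000_000_000, 1_000_000_000))
        os.utime(copy, (1_600_000_000, 1_600_000_000))
        before = os.stat(copy).st_mtime
        # Replace the copy with a hard link to the original's inode.
        os.remove(copy)
        os.link(original, copy)
        after = os.stat(copy).st_mtime
        return before, after, os.stat(original).st_mtime
```

After the call, the second and third values are equal: the copy's own mtime is gone, which matters if a later tool (rsync without --checksum, for instance) compares size and mtime to decide whether a file has changed.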

(By the way, I reimplemented it in C, "just for fun" and for speed too: https://github.com/outtersg/dude/ . Hmm, in C, but in French.)

--
Guillaume

--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
