> You're asking du to report each directory separately, which is deceiving
> you.  Run "du -hsc /backup/websites" instead.

When I run that, it shows that there truly are two complete copies:

$ du -hsc /backup/websites
4.2G    /backup/websites
4.2G    total

I'm fairly certain du is intelligent enough to recognize hard links like this and report the correct usage. (I'm using GNU du.) In any event, if what you say were the case, why does du correctly report the disk usage on the smaller directory tree?
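
For anyone who wants to check the du behaviour for themselves, here is a minimal sketch. It uses only throwaway temp paths (the day1/day2 names are made up, not my real backup tree): it hard-links one file into a second directory, compares the inode numbers, and lets du total both directories in one invocation.

```shell
#!/bin/sh
# Sketch: a hard-linked file has one inode, and du counts it once per invocation.
# All paths are throwaway temp paths (hypothetical), not the real backup tree.
set -eu
work=$(mktemp -d)
mkdir "$work/day1" "$work/day2"

# One 1 MiB file in day1, hard-linked into day2 (what --link-dest should produce).
dd if=/dev/zero of="$work/day1/file" bs=1024 count=1024 2>/dev/null
ln "$work/day1/file" "$work/day2/file"

# The same inode number in both trees means there is only one copy on disk.
inode1=$(ls -i "$work/day1/file" | awk '{print $1}')
inode2=$(ls -i "$work/day2/file" | awk '{print $1}')
# Second column of ls -l is the link count; it should be 2 here.
links=$(ls -l "$work/day1/file" | awk '{print $2}')
echo "inodes: $inode1 $inode2 (link count: $links)"

# du over both trees in a single run counts the shared file only once.
du -sck "$work/day1" "$work/day2"

rm -rf "$work"
```

If the two inode numbers differ on a real pair of snapshots, the files are separate copies and du is right to count both.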

Secondly, rsync *is* pulling down all the files again. When I manually rsync the initial copy without using --link-dest, it takes 35 seconds to sync the day's changes. When I use --link-dest though, it takes the full 3.5 hours and more than 2G of network traffic gets received. Examining the verbose output, rsync does not find any files "uptodate".

After playing with this some more today, I've noticed that rsync will occasionally, but very rarely, work correctly on the large tree. As far as I can tell it does so completely at random, and I can't find any pattern or salient factor that changes from one run to the next and makes it re-pull everything. I can sometimes run it two or three times in a row and it works exactly as it should, and then on the next attempt it re-pulls the entire tree. As I said, though, the majority of the time it re-pulls. Why is it doing this?
