Hi,

On 12.06.25 21:58, Daniel Vogelbacher wrote:
> Hi Eric,
>
> On 6/12/25 17:33, Eric Le Lay wrote:
>> I use rsync to copy data (~10TB) to backup storage.
>>
>> To speed things up I use the ceph.dir.rctime extended attribute to instantly ignore sub-trees that haven't changed, without iterating through their contents. I have to maintain the ceph.dir.rctime value between backups: I just keep it in a file per top-level directory on the target storage.
> This sounds interesting. Can you give some advice on how to set this up with rsync?

The rctime attribute is useful, but I wouldn't rely on it. As far as I know it is stored in the directory inode, so each operation on a file or directory will update the rctime on all path elements above it (I am not sure whether this happens synchronously or asynchronously).


The problem is that it is just a single value. Imagine a rogue user or rogue host that touches a file in a subdirectory, sets its ctime to 01/01/2300, and then removes the file. Although the removal is the last operation, setting the ctime also updates the rctime of all path elements, and the removal of the file cannot revert this. So your backup check will detect a last change on 01/01/2300 for the subtree and will probably perform a complete rsync, even if all files are still the same.
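
To make the scenario concrete, here is a toy model in Python. It is only a sketch of the behaviour described above, not how Ceph implements it: the Dir class and record_change() are made-up names, and the model simply assumes that rctime behaves like a maximum that is propagated to every parent directory and never lowered.

```python
from datetime import datetime, timezone


class Dir:
    """Toy directory that only tracks a recursive ctime (hypothetical model)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.rctime = 0.0

    def record_change(self, ctime):
        # Assumption: every change pushes its ctime up the path, and each
        # directory keeps the largest value it has ever seen.
        node = self
        while node is not None:
            node.rctime = max(node.rctime, ctime)
            node = node.parent


def ts(year, month, day):
    """UTC timestamp for a calendar date."""
    return datetime(year, month, day, tzinfo=timezone.utc).timestamp()


def rogue_scenario():
    """Replay the touch / bogus ctime / remove sequence on a small tree."""
    root = Dir("backup-root")
    sub = Dir("subdir", parent=root)
    sub.record_change(ts(2025, 6, 12))  # file is touched
    sub.record_change(ts(2300, 1, 1))   # its ctime is set far in the future
    sub.record_change(ts(2025, 6, 12))  # file is removed again
    return root, sub
```

In this model both directories returned by rogue_scenario() end up with the year 2300 value, although the tree contains the same files as before.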


rctime is fine for controlled environments without rogue elements (== no users... ;-) ). And it can definitely be used to skip subtrees. But the check can easily be rendered useless.
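
If you still want to use it, the check could look roughly like the Python sketch below. This is untested against a real cluster and rests on several assumptions: the attribute value is taken to look like "<seconds>.<nanoseconds>" and only the seconds are used, the saved value comes from the per-directory file Eric mentioned, and decide(), check() and FUTURE_SLACK are names I made up. The extra test for timestamps in the future is my own precaution against the rogue case, not something rsync or Ceph provides.

```python
import os
import time

# Hypothetical tolerance for clock skew between hosts, in seconds.
FUTURE_SLACK = 24 * 3600


def parse_rctime(raw):
    """Return the whole-second part of a ceph.dir.rctime value.

    Assumes the value looks like b"<seconds>.<nanoseconds>"; the fraction
    is ignored, so its exact formatting does not matter here.
    """
    return int(raw.decode().split(".", 1)[0])


def read_rctime(directory):
    # os.getxattr is Linux-only; the directory must be on a CephFS mount.
    return parse_rctime(os.getxattr(directory, "ceph.dir.rctime"))


def decide(current, saved, now):
    """Return "sync" or "skip" for one top-level directory."""
    if saved is None:
        return "sync"  # first backup, nothing to compare with
    if current > now + FUTURE_SLACK:
        return "sync"  # rctime lies in the future, so do not trust it
    if current > saved:
        return "sync"  # something below this directory changed
    return "skip"


def check(directory, saved):
    """Read the live rctime and decide whether rsync has to run."""
    return decide(read_rctime(directory), saved, time.time())
```

You would call check() once per top-level directory, run rsync only when it returns "sync", and store the new rctime value after a successful run.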


Best regards,

Burkhard



_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
