Hi,
On 12.06.25 21:58, Daniel Vogelbacher wrote:
> Hi Eric,
>
> On 6/12/25 17:33, Eric Le Lay wrote:
>> I use rsync to copy data (~10TB) to backup storage.
>> To speed things up I use the ceph.dir.rctime extended attribute to
>> instantly ignore sub-trees that haven't changed without iterating
>> through their contents.
>> I have to maintain the ceph.dir.rctime value between backups: I just
>> keep it in a file per top-level directory on the target storage.
>
> This sounds interesting. Can you give some advice how to set this up
> with rsync?
The rctime attribute is useful, but I wouldn't rely on it. As far as I
know it is stored in the directory inode, so each operation on a file or
directory updates the rctime on all path elements (I'm not sure whether
this happens synchronously or asynchronously).
The problem is that it is just a single value. Imagine one rogue user or
rogue host that touches a file in a subdirectory, sets its ctime to
01/01/2300, and then removes the file. Although the removal is the last
operation, setting the ctime also updates the rctime of all path
elements, and the removal of the file cannot revert this. So your backup
check will detect a last change on 01/01/2300 for the subtree and
probably perform a complete rsync, even if all files are still the same.
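To make the scenario concrete, here is a toy Python model. It is not Ceph
code: ToyDir and record_ctime are made-up names, and the model simply
assumes that rctime behaves as a running maximum that is pushed up to
every path element, as described above.

```python
from datetime import datetime, timezone


def ts(year, month, day):
    """UTC timestamp for a calendar date."""
    return datetime(year, month, day, tzinfo=timezone.utc).timestamp()


class ToyDir:
    """Hypothetical stand-in for a directory inode carrying an rctime."""

    def __init__(self, parent=None):
        self.parent = parent
        self.rctime = 0.0

    def record_ctime(self, ctime):
        # Assumption: rctime is a running maximum, pushed up to every
        # path element; it never moves backwards.
        node = self
        while node is not None:
            node.rctime = max(node.rctime, ctime)
            node = node.parent


root = ToyDir()
sub = ToyDir(parent=root)

sub.record_ctime(ts(2025, 6, 1))    # ordinary change before the backup
saved = root.rctime                 # value the backup remembers

sub.record_ctime(ts(2300, 1, 1))    # rogue file gets ctime 01/01/2300
sub.record_ctime(ts(2025, 6, 12))   # the file is removed again

poisoned = root.rctime
changed_after_rogue = poisoned != saved          # full rsync is triggered

sub.record_ctime(ts(2025, 7, 1))    # a later, genuine change
later_change_visible = root.rctime != poisoned   # stays hidden in this model
```

In this model the top-level rctime stays at 2300 after the removal. If the
real attribute behaves the same way, a later genuine change would no
longer move the value either.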
rctime is fine for controlled environments without rogue elements (== no
users... ;-) ). And it can definitely be used to skip subtrees. But the
check can easily be rendered useless.
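For reference, a minimal sketch of such a skip check in Python, along the
lines Eric described (one saved value per top-level directory). The
function names, the state file and the sync callback are my own
placeholders, not anything from Eric's setup. The attribute value is
compared as an opaque string, so the sketch does not depend on its exact
format.

```python
import os
from pathlib import Path

RCTIME_XATTR = "ceph.dir.rctime"


def read_rctime(directory, getxattr=None):
    """Return the raw rctime string of a CephFS directory."""
    getxattr = getxattr or os.getxattr   # os.getxattr is Linux-only
    return getxattr(str(directory), RCTIME_XATTR).decode().strip()


def backup_if_changed(directory, state_file, sync, getxattr=None):
    """Run sync(directory) unless rctime matches the saved value.

    Returns True if sync was called, False if the subtree was skipped.
    """
    state = Path(state_file)
    current = read_rctime(directory, getxattr)   # read BEFORE syncing
    if state.exists() and state.read_text().strip() == current:
        return False
    sync(directory)
    # Save the value read before the sync, so changes made while the
    # sync was running are picked up by the next run.
    state.write_text(current + "\n")
    return True
```

The sync argument would be whatever wraps the rsync call for that
directory. The getxattr parameter only exists so the check can be tried
out without a CephFS mount.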
Best regards,
Burkhard
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io