Hi

I am running a multi-region Ceph cluster on version 18.2.6 (Reef). The
system has been operating reliably for a long time and currently holds over
100,000 buckets and approximately 1.5 billion objects.

We recently attempted to add a new zone to the primary region. To control
the synchronization scope, we ran sync disable on every bucket, intending
to enable bidirectional sync selectively for specific buckets only.

A Lua script is in place and functioning correctly, issuing a sync disable
request via POST whenever a new bucket is created.

Despite this setup, once the new zone was added, data synchronization began
for all buckets, including those where sync was explicitly disabled. Upon
investigation, we found that a large number of deleted buckets still have
bucket instances on the master zone with the sync flag set to 0. For
example, we have ~100k active buckets but over 150k bucket instances.

To mitigate this, we manually updated all stale bucket instances using
radosgw-admin metadata put to set the sync flag to 8. The change is
reflected correctly on the secondary zone, so metadata synchronization is
working as expected. However, data is still being fully synced for those
deleted buckets.
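
For reference, a minimal Python sketch of the flag change described above.
It assumes that the value 8 corresponds to the "data sync disabled" bit
(0x8) in the RGW bucket flags, and that the bucket instance metadata JSON
keeps the flags under data.bucket_info.flags; both are assumptions, and the
helper names are purely illustrative. The bit is ORed in rather than
written over the field, so other flag bits (for example versioning) are
preserved.

```python
import copy
import json

# Assumption: bit 0x8 in the bucket flags means "data sync disabled".
DATASYNC_DISABLED = 0x8


def sync_disabled(meta):
    """True if the data-sync-disabled bit is set in the instance metadata."""
    return bool(meta["data"]["bucket_info"]["flags"] & DATASYNC_DISABLED)


def with_sync_disabled(meta):
    """Return a copy of the metadata with the data-sync-disabled bit set."""
    updated = copy.deepcopy(meta)
    updated["data"]["bucket_info"]["flags"] |= DATASYNC_DISABLED
    return updated


# Trimmed-down, made-up metadata document for illustration only.
meta = {"key": "bucket.instance:x:123456", "data": {"bucket_info": {"flags": 0}}}
patched = with_sync_disabled(meta)
print(json.dumps(patched["data"]["bucket_info"]))  # {"flags": 8}
```

The patched document is what would then be passed back to radosgw-admin
metadata put for the corresponding bucket.instance key.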

An example scenario:
• A bucket x was previously created with bucket ID 123456 and later deleted.
• A new bucket x is created by a user with a different ID 654321.
• Both bucket instances exist: x:123456 and x:654321.
• During data sync, it appears the system tries to replicate bucket x
regardless of which bucket ID is associated with it.
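
The scenario above can be reproduced with a small Python sketch that
separates live from stale instances. It assumes the current bucket ID of
each live bucket has already been collected from the bucket entrypoint
metadata, and that the keys returned by radosgw-admin metadata list
bucket.instance have the form name:id (optionally tenant/name:id); the
function names are hypothetical.

```python
def split_instance_key(key):
    """Split 'name:id' (or 'tenant/name:id') into (name, id) at the last colon."""
    name, _, bucket_id = key.rpartition(":")
    return name, bucket_id


def find_stale_instances(current_ids, instance_keys):
    """Return instance keys whose ID is not the current ID of a live bucket.

    current_ids:   dict of bucket name -> bucket ID from the entrypoint metadata
    instance_keys: iterable of 'name:id' keys from the bucket.instance listing
    """
    stale = []
    for key in instance_keys:
        name, bucket_id = split_instance_key(key)
        # A deleted bucket has no entry at all; a recreated bucket has a new ID.
        if current_ids.get(name) != bucket_id:
            stale.append(key)
    return stale


# Values taken from the example scenario in this message.
current = {"x": "654321"}
instances = ["x:123456", "x:654321"]
print(find_stale_instances(current, instances))  # ['x:123456']
```

This only lists candidates; it does not say whether removing them is safe.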

We also tried cleaning up by deleting all relevant data sync and full sync
logs from the log pool and running datalog trim, but the issue persists.

I have two main questions:
1. Why is full data sync still occurring for bucket instances with the sync
flag set to 8?
2. What is the recommended and safe approach to remove old/stale bucket
instances without risking data loss for users?

Currently, the sync configuration replicates data between all zones
("sync_from_all": true), with "log_meta": false and "log_data": true.

Looking forward to your insights. Please let me know if further details or
logs are needed.

Best regards,
Ramin
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io