Hello,

We are having an issue with one of our multisite zones failing to synchronize. All replication shards are "busy", and we see a lot of errors in `sync error list` on the secondary zone.

We are running Ceph 19.2.3 (cephadm, Ubuntu 22.04).

After some digging, we found that objects on our master zone have a PENDING replication status.

For example:

radosgw-admin object stat --bucket 7c52c9db-de61-4186-a3ce-e95a96df6be1 --object 0000d5d9-0a8c-40cf-88e9-30bf908c7743

    "attrs": {
        "user.rgw.amz-replication-status": "PENDING",
        "user.rgw.content_type": "application/octet-stream",

We have 3 zones in this zonegroup. Both secondary zones have this object replicated, and on them the attribute reads:

        "user.rgw.amz-replication-status": "REPLICA",

On the master zone, however, the object never got marked as "COMPLETED".
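
For anyone who wants to compare the same attribute on their own zones, here is a minimal sketch of how the check could be scripted. It assumes that `radosgw-admin object stat` prints a single JSON document with a top-level "attrs" map, as in the excerpt above. The function names are hypothetical, and the trailing NUL stripping is a precaution, not something we have confirmed is needed.

```python
import json
import subprocess
import sys

# Attribute name taken from the object stat excerpt above.
STATUS_ATTR = "user.rgw.amz-replication-status"


def replication_status(stat_json):
    """Return the replication status from object stat JSON, or None if absent.

    Assumes the JSON has a top-level "attrs" map (an assumption based on
    the excerpt above, not on a documented schema).
    """
    doc = json.loads(stat_json)
    value = doc.get("attrs", {}).get(STATUS_ATTR)
    if value is None:
        return None
    # Precaution: drop trailing NUL bytes in case attr values carry them.
    return value.rstrip("\x00")


def stat_object(bucket, obj):
    """Run `radosgw-admin object stat` against the local zone, return stdout."""
    result = subprocess.run(
        ["radosgw-admin", "object", "stat", "--bucket", bucket, "--object", obj],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical script name): python3 repl_status.py <bucket> <object>
    print(replication_status(stat_object(sys.argv[1], sys.argv[2])))
```

Run once on a host in each zone, this would be expected to print REPLICA on the secondaries and, in our case, PENDING on the master.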

Running `data sync init` on the bucket didn't help. We cannot fix this "PENDING" status on the Ceph side without deleting and recreating the affected bucket.

Has anyone ever seen a similar issue?

My current theory is that something may have gone wrong during bucket resharding. I was running tests that uploaded small objects to a bucket, and the test was supposed to trigger resharding mid-benchmark.

Best regards
Adam Prycki
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
