Just my random 2 cents.

I feel like I hit the same issues some years ago when updating realm periods
while testing bucket replication and replication in general. It makes me sad
that it’s still an issue. I remember having to recreate things multiple times
when hitting this.

I was hoping this would no longer be a problem this far into the future. If I
needed to change our realms/zonegroups, add zones, etc. today, I would probably
back up the .rgw.root pool first to have some way of reverting it, even though
that might be bad as well.
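
As a rough illustration of that backup idea, here is a minimal Python sketch
that wraps `rados export`. It assumes the `rados` CLI is installed, can reach
the cluster, and provides the `export` subcommand; the function names and the
destination directory are made up for the example. Whether importing such a
dump back into a live multisite setup is safe is not something this sketch
addresses.

```python
import subprocess
from datetime import datetime, timezone


def build_export_command(pool: str, dest: str) -> list[str]:
    """Return the argv for dumping every object of `pool` into `dest`."""
    # `rados export` serializes the pool contents into a single file.
    return ["rados", "-p", pool, "export", dest]


def backup_pool(pool: str = ".rgw.root", directory: str = "/var/tmp") -> str:
    """Run the export and return the dump path (hypothetical naming scheme)."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    dest = f"{directory}/{pool.strip('.')}-{stamp}.dump"
    # check=True raises CalledProcessError if rados exits with an error.
    subprocess.run(build_export_command(pool, dest), check=True)
    return dest
```

Calling `backup_pool()` before touching realms, zonegroups or zones would
leave a timestamped dump of .rgw.root under /var/tmp.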

/Tobias

> On 21 May 2025, at 15:16, Michel Jouvin <michel.jou...@ijclab.in2p3.fr> wrote:
>
> Hi,
>
> An update on this issue. Thanks to suggestions from Frédéric Nass, I think I
> managed to clear the problem by deleting the realm and all its objects
> (zonegroup, zone, period) with radosgw-admin and deleting the pools
> associated with the deleted zone. I am sure it is not a general solution for
> this problem, which I was able to reproduce on a test cluster. I have the
> feeling that radosgw-admin should do a better job of avoiding such a mess
> when deleting zones, but that is another story. The reasons why deleting
> the realm and its objects worked for us include:
>
> - The realm/zonegroup/zone had just been created and there was no useful
> content in it, so losing everything related to it was an option, as said
> previously (but deleting .rgw.root was not an option as we have several
> realms in production).
>
> - We configure each realm/zonegroup/zone with a separate set of RGWs (which
> cephadm can deploy on the same server, but that is another story), so the
> only RGWs impacted are those related to the deleted realm.
>
> - Our realm was single-site. After deleting the realm, it is not possible to
> push (commit) the change to other zonegroups/zones of the realm, as the realm
> must exist to be able to commit a new period. I guess that in a multisite
> configuration, this means the cleanup must be done in all the clusters
> involved in the multisite configuration.
>
> Best regards,
>
> Michel
>
> On 14/05/2025 at 18:12, Michel Jouvin wrote:
>> Hi,
>>
>> We are still stuck with this problem and I have not seen an answer to my
>> previous emails. We found the explanation of the problem in the doc:
>> https://docs.ceph.com/en/latest/radosgw/multisite/#deleting-a-zone. But the
>> doc does not mention the way out of the problem... Would it help if we
>> deleted the realm? There is no content in this realm/zonegroup/zone, so
>> removing everything is an option if it helps.
>>
>> Thanks in advance for any hint. Best regards,
>>
>> Michel
>> Sent from my mobile
>>
>> On 7 May 2025 16:49:19, Michel Jouvin <michel.jou...@ijclab.in2p3.fr> wrote:
>>
>>> Hi,
>>>
>>> I managed to find what the zone and zonegroup IDs were before they were
>>> deleted, and I confirm that those referred to in the error messages are
>>> the IDs of the deleted zone and zonegroup. The new zone and zonegroup
>>> (which have the same names; again, not sure if that is a problem, as
>>> everything should be done by ID, shouldn't it?) have been defined as master
>>> zone and zonegroup, so the other ones should just be deleted, shouldn't
>>> they? I really don't understand what the error means and what can be done
>>> to fix it.
>>>
>>> Best regards,
>>>
>>> Michel
>>>
>>> On 06/05/2025 at 21:29, Michel Jouvin wrote:
>>>> Hi,
>>>>
>>>> It is not the first time that, after making configuration changes in
>>>> RADOS for a realm/zonegroup/zone with radosgw-admin, we get errors
>>>> when trying to do a "period update --commit". We never found good
>>>> documentation on how to fix these problems. Up to now we have always
>>>> managed at some point to restore a good configuration that can be
>>>> committed, but it is probably time for us to take a more informed approach!
>>>>
>>>> The latest occurrence of the problem happened today with a recently
>>>> created realm/zonegroup/zone. While trying to fix a problem with
>>>> the non-working haproxy associated with it, one of my colleagues
>>>> decided to delete and recreate the zone and zonegroup (with the same
>>>> names). The related commands worked, but since then any
>>>> attempt to do "period update --commit" results in the following error:
>>>>
>>>> -------
>>>>
>>>> 2025-05-06T11:56:20.939+0200 7fdc7d41da80 0 failed reading obj info from .rgw.root:zone_info.93af6e0c-4552-4c2e-b167-36114a5a81e4: (2) No such file or directory
>>>> 2025-05-06T11:56:20.945+0200 7fdc7d41da80 0 failed reading obj info from .rgw.root:zonegroup_info.d7221099-4e7d-43cb-a1e8-28a750de1cd5: (2) No such file or directory
>>>> 2025-05-06T11:56:21.160+0200 7fdc7d41da80 0 failed reading obj info from .rgw.root:zone_info.93af6e0c-4552-4c2e-b167-36114a5a81e4: (2) No such file or directory
>>>> 2025-05-06T11:56:21.160+0200 7fdc7d41da80 -1 Cannot find zone id=93af6e0c-4552-4c2e-b167-36114a5a81e4 (name=default)
>>>> 2025-05-06T11:56:21.160+0200 7fdc7d41da80 0 ERROR: failed to start notify service ((22) Invalid argument
>>>> 2025-05-06T11:56:21.160+0200 7fdc7d41da80 0 ERROR: failed to init services (ret=(22) Invalid argument)
>>>> couldn't init storage provider
>>>> -------
>>>>
>>>> I have the feeling that it is related to the deleted objects that are
>>>> no longer found, but it is not completely clear what the way out of
>>>> it is. Is the problem related to recreating the zone/zonegroup with the
>>>> same names? There are several realms already in production so we
>>>> cannot do a .rgw.root reset, but this particular realm has never been
>>>> put in production so we can delete everything related to it.
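
To see which IDs the errors refer to, the log can be scanned for the missing
.rgw.root objects. A minimal Python sketch, assuming the log format shown in
the error output above; the function name `missing_objects` is made up for the
example, and the sample lines are copied from that output, one entry per line.

```python
import re
from collections import defaultdict

# Matches e.g. ".rgw.root:zone_info.<uuid>" in a "failed reading obj info" line.
PATTERN = re.compile(
    r"failed reading obj info from \S+?:(zone_info|zonegroup_info)\.([0-9a-f-]{36})"
)


def missing_objects(log_text: str) -> dict[str, set[str]]:
    """Group the IDs of objects that could not be read by object kind."""
    found = defaultdict(set)
    for kind, obj_id in PATTERN.findall(log_text):
        found[kind].add(obj_id)
    return dict(found)


SAMPLE = """\
2025-05-06T11:56:20.939+0200 7fdc7d41da80 0 failed reading obj info from .rgw.root:zone_info.93af6e0c-4552-4c2e-b167-36114a5a81e4: (2) No such file or directory
2025-05-06T11:56:20.945+0200 7fdc7d41da80 0 failed reading obj info from .rgw.root:zonegroup_info.d7221099-4e7d-43cb-a1e8-28a750de1cd5: (2) No such file or directory
2025-05-06T11:56:21.160+0200 7fdc7d41da80 0 failed reading obj info from .rgw.root:zone_info.93af6e0c-4552-4c2e-b167-36114a5a81e4: (2) No such file or directory
"""

if __name__ == "__main__":
    for kind, ids in missing_objects(SAMPLE).items():
        print(kind, sorted(ids))
```

The IDs it collects can then be compared with those of the current zone and
zonegroup, for instance from `radosgw-admin zone get` and
`radosgw-admin zonegroup get`, if those commands still run in this state.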
>>>>
>>>> Thanks in advance for any hint or pointer. Best regards,
>>>>
>>>> Michel
>>>>
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users@ceph.io
>>> To unsubscribe send an email to ceph-users-le...@ceph.io