Quoting Eugen Block <[email protected]>:
> But one more question on this: why is it allowed to remove an image
> from the group if there are existing snapshots? Shouldn't this be
> prevented to keep the group consistency?
>
> And just for my understanding: how are the group snapshots
> technically created? Is that one snapshot for all images or is it an
> individual snapshot per image?
>
> Quoting Eugen Block <[email protected]>:
>
>> I understand, you're right about the group consistency of course. I
>> just thought if you can remove an image from the group, it would
>> also remove the image's snapshot(s) from the list of snapshots as
>> well. My scenario is: initially I thought it would make sense to
>> have those servers in a group because if I wanted to rollback, it
>> would make sense to do it for all. But then I thought a bit more
>> about it and decided that one of the images actually doesn't make
>> sense to be in that group. Re-adding it will cause more problems in
>> case of rollback... I need to think about this...
>>
>> Thanks a lot for taking the time, I really appreciate it!
>>
>> Quoting Ilya Dryomov <[email protected]>:
>>
>>> On Wed, Jun 3, 2026 at 10:27 AM Eugen Block <[email protected]> wrote:
>>>>
>>>> Hi,
>>>>
>>>> that is correct, log_to_stderr is false in our cluster. And with
>>>> --log-to-stderr true the result is as you expected:
>>>>
>>>> rbd: rollback group to snapshot failed: 2026-06-03T08:06:52.930+0000
>>>> 7f53896ae0c0 -1 librbd::api::Group: snap_rollback: group snapshot
>>>> membership does not match group membership
>>>>
>>>> But what's the conclusion here? So it's not allowed to rollback if
>>>> memberships don't match. How would I correct the membership?
>>>
>>> Re-add the image back to the group if the image is still around.
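
A minimal sketch of that re-add step, using only the commands already shown in this thread. The helper name readd_and_rollback and the RBD variable are hypothetical. By default the script only prints the commands (dry run). It assumes, per the advice above, that the rollback check passes once the group membership matches the snapshot again. Note that the rollback then also reverts the re-added image.

```shell
#!/bin/sh
# Dry run by default: prints the commands instead of running them.
# Set RBD="rbd --id user" to execute against a real cluster.
RBD=${RBD:-echo rbd --id user}

# Hypothetical helper: re-add an image to a group, then retry the rollback.
readd_and_rollback() {
    group=$1; image=$2; snap=$3
    # Restore the membership the group snapshot was taken with ...
    $RBD group image add "$group" "$image" || return 1
    # ... then retry; this also rolls back the re-added image.
    $RBD group snap rollback "$group@$snap"
}

readd_and_rollback images/test-servers \
    images/0f69278e-00c2-46b0-b6e7-0b06e9c8b6fd_disk snap1
```

If the image should stay out of the group for good, this does not help: it only restores the state in which the old snapshot can be rolled back.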
>>>
>>>> Because I
>>>> wouldn't want to delete all snapshots from before I removed one image
>>>> from the group. Is there any workaround?
>>>
>>> I'm not sure I see what needs to be worked around here. The group is
>>> supposed to be a logical collection of images where some level of
>>> consistency between images is required, not a random "bag". This
>>> suggests that while images can come and go (i.e. be added and removed
>>> from the group), the group can't always be meaningfully rolled back.
>>> For example, if a group snapshot captured images A, B and C but image
>>> C had since been removed from the group and potentially reformatted,
>>> repurposed for something else or removed altogether, the group's state
>>> exactly as of that snapshot just can't be restored.
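
To illustrate the check behind the error message, here is a toy model in shell. It is not librbd's code; the function name membership_matches and the image lists are made up. It assumes the rule is simply that the set of images captured by the group snapshot must equal the set of images currently in the group.

```shell
#!/bin/sh
# Toy model of the membership check; NOT librbd's implementation.
# Both arguments are whitespace-separated lists of image names.
membership_matches() {
    # Word-split each list, then sort so that order does not matter.
    snap_members=$(printf '%s\n' $1 | sort -u)
    group_members=$(printf '%s\n' $2 | sort -u)
    [ "$snap_members" = "$group_members" ]
}

# Images captured by the group snapshot vs. images in the group now.
at_snapshot="images/A_disk images/B_disk images/C_disk"
after_removal="images/A_disk images/B_disk"

if membership_matches "$at_snapshot" "$at_snapshot"; then
    echo "unchanged group: rollback allowed"
fi
if ! membership_matches "$at_snapshot" "$after_removal"; then
    echo "image removed: rollback refused (EINVAL)"
fi
```

In this model, removing image C from the group is enough to make every snapshot that captured A, B and C fail the check, which matches the behavior seen in this thread.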
>>>
>>> Thanks,
>>>
>>> Ilya
>>>
>>>>
>>>> Thanks,
>>>> Eugen
>>>>
>>>> Quoting Ilya Dryomov <[email protected]>:
>>>>
>>>>> On Fri, May 29, 2026 at 6:41 PM Eugen Block <[email protected]> wrote:
>>>>>>
>>>>>> The commands were:
>>>>>>
>>>>>> controller02:~# rbd --id user group create images/test-servers
>>>>>>
>>>>>> controller02:~# for i in 0f69278e-00c2-46b0-b6e7-0b06e9c8b6fd
>>>>>> 72f5816c-c1db-44de-b0a2-19d661faa963
>>>>>> 47d6144e-0d5a-4dc7-82dd-5be3edf9f6cc; do rbd --id user group image add
>>>>>> images/test-servers images/${i}_disk; done
>>>>>>
>>>>>> controller02:~# rbd --id user group image ls images/test-servers
>>>>>> images/0f69278e-00c2-46b0-b6e7-0b06e9c8b6fd_disk
>>>>>> images/47d6144e-0d5a-4dc7-82dd-5be3edf9f6cc_disk
>>>>>> images/72f5816c-c1db-44de-b0a2-19d661faa963_disk
>>>>>>
>>>>>> controller02:~# rbd --id user group snap create
>>>>>> images/test-servers@snap1
>>>>>>
>>>>>> controller02:~# rbd --id user group snap ls images/test-servers
>>>>>> NAME STATUS
>>>>>> snap1 ok
>>>>>>
>>>>>>
>>>>>> # rollback works for all images
>>>>>> controller02:~# rbd --id user group snap rollback
>>>>>> images/test-servers@snap1
>>>>>> Rolling back to group snapshot: 100% complete...done.
>>>>>>
>>>>>> # removing one image from the group
>>>>>> controller02:~# rbd --id user group image rm images/test-servers
>>>>>> images/0f69278e-00c2-46b0-b6e7-0b06e9c8b6fd_disk
>>>>>>
>>>>>> # rollback fails
>>>>>> controller02:~# rbd --id user group snap rollback
>>>>>> images/test-servers@snap1
>>>>>> Rolling back to group snapshot: 0% complete...failed.
>>>>>> rbd: rollback group to snapshot failed: (22) Invalid argument
>>>>>>
>>>>>> I'll add the debug output later, will need to sanitize it first. But I
>>>>>> don't see anything obvious in there.
>>>>>
>>>>> Hi Eugen,
>>>>>
>>>>> Based on the above, it's https://tracker.ceph.com/issues/66300 and is
>>>>> therefore the intended behavior. The only fly in the ointment is that
>>>>> you aren't seeing the associated "group snapshot membership does not
>>>>> match group membership" error message.
>>>>>
>>>>> You not seeing it is consistent with the attached debug output where
>>>>> only very early messenger traffic is present and nothing beyond that.
>>>>> It suggests some non-conventional settings in the cluster-wide config
>>>>> such as log_to_stderr being set to false or similar.
>>>>>
>>>>> Can you try appending --log-to-stderr true to the "rbd group snap
>>>>> rollback" command?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Ilya
>>>>>
>>>>>>
>>>>>> Quoting Ilya Dryomov <[email protected]>:
>>>>>>
>>>>>>> On Fri, May 29, 2026 at 4:05 PM Eugen Block <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> thanks for your quick reply. No, I didn't see any output other than
>>>>>>>> what I shared (invalid argument). I could add debug log level
>>>>>>>> if necessary.
>>>>>>>
>>>>>>> That error message should have been displayed no matter the log level,
>>>>>>> so something other than https://tracker.ceph.com/issues/66300 might be
>>>>>>> involved.
>>>>>>>
>>>>>>> What exactly do you mean by "I removed an image from the group
>>>>>>> snapshot"? Which commands were run there and in what order?
>>>>>>>
>>>>>>>> But one more detail, I also tried the rollback directly within the
>>>>>>>> cephadm shell (so version 19.2.3) with the same result:
>>>>>>>>
>>>>>>>> ceph03:~ # cephadm shell
>>>>>>>> ...
>>>>>>>> [ceph: root@ceph03 /]# rbd group snap rollback
>>>>>>>> images/test-servers@20260430_start
>>>>>>>> Rolling back to group snapshot: 0% complete...failed.
>>>>>>>> rbd: rollback group to snapshot failed: (22) Invalid argument
>>>>>>>>
>>>>>>>> [ceph: root@ceph03 /]# ceph -v
>>>>>>>> ceph version 19.2.3 (c92aebb279828e9c3c1f5d24613efca272649e62)
>>>>>>>> squid (stable)
>>>>>>>
>>>>>>> Can you try appending --debug-ms 1 --debug-rbd 20 to the command
>>>>>>> (let's stick to this cephadm shell) and attach the output?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Ilya
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>> Eugen
>>>>>>>>
>>>>>>>> Quoting Ilya Dryomov <[email protected]>:
>>>>>>>>
>>>>>>>> > On Fri, May 29, 2026 at 2:33 PM Eugen Block via ceph-users
>>>>>>>> > <[email protected]> wrote:
>>>>>>>> >>
>>>>>>>> >> Hi,
>>>>>>>> >>
>>>>>>>> >> I wanted to rollback a group snapshot on Ubuntu 24.04 (rbd client
>>>>>>>> >> version 19.2.1), the Ceph cluster version is 19.2.3. The client fails
>>>>>>>> >> with "invalid argument":
>>>>>>>> >>
>>>>>>>> >> controller02:~# rbd --id <user> group snap rollback --pool images
>>>>>>>> >> --group test-servers --snap 20260430_start
>>>>>>>> >> Rolling back to group snapshot: 0% complete...failed.
>>>>>>>> >> rbd: rollback group to snapshot failed: (22) Invalid argument
>>>>>>>> >>
>>>>>>>> >> controller02:~# ceph -v
>>>>>>>> >> ceph version 19.2.1 (9efac4a81335940925dd17dbf407bfd6d3860d28)
>>>>>>>> >> squid (stable)
>>>>>>>> >>
>>>>>>>> >> But running the same command (just as admin, not as <user>) on a Ceph
>>>>>>>> >> node works:
>>>>>>>> >>
>>>>>>>> >> ceph03:~ # rbd group snap rollback --pool images --group test-servers
>>>>>>>> >> --snap 20260430_start
>>>>>>>> >> Rolling back to group snapshot: 100% complete...done.
>>>>>>>> >>
>>>>>>>> >> ceph03:~ # ceph -v
>>>>>>>> >> ceph version 16.2.13-66-g54799ee0666
>>>>>>>> >> (54799ee06669271880ee5fc715f99202002aa371) pacific (stable)
>>>>>>>> >>
>>>>>>>> >>
>>>>>>>> >> What seems to be the issue here is that I removed an image from the
>>>>>>>> >> group snapshot. I wonder if it could be this bug [0] which is supposed
>>>>>>>> >> to be fixed in 19.2.0 according to the "Released In" field of the
>>>>>>>> >> Squid backport tracker [1].
>>>>>>>> >>
>>>>>>>> >> This seems a little inconsistent to me, could someone please clarify?
>>>>>>>> >
>>>>>>>> > Hi Eugen,
>>>>>>>> >
>>>>>>>> > Did you see "group snapshot membership does not match group membership"
>>>>>>>> > error message when the rollback command failed with 19.2.1 client?
>>>>>>>> >
>>>>>>>> > Thanks,
>>>>>>>> >
>>>>>>>> > Ilya