On Mon, Aug 12, 2024 at 10:20 AM Oliver Freyermuth
<freyerm...@physik.uni-bonn.de> wrote:
>
> Dear Cephalopodians,
>
> we've successfully operated a "good old" Mimic cluster with primary RBD
> images, replicated via journaling to a "backup cluster" running Octopus,
> for the past few years (i.e. one-way replication).
> We've now finally gotten around to upgrading the cluster with the primary
> images to Octopus (and plan to upgrade further in the near future).
>
> After the upgrade, all MON+MGR+OSD+rbd_mirror daemons are running 15.2.17.
>
> We run three rbd-mirror daemons which all share the following auth client
> in the "backup" cluster, to which they write:
>
> client.rbd_mirror_backup
>     caps: [mon] profile rbd-mirror
>     caps: [osd] profile rbd
>
> and the following shared auth client in the "primary" cluster, from which
> they read:
>
> client.rbd_mirror
>     caps: [mon] profile rbd
>     caps: [osd] profile rbd
>
> i.e. the same auth as described in the docs[0].
>
> Checking on the primary cluster, we get:
>
> # rbd mirror pool status
> health: UNKNOWN
> daemon health: UNKNOWN
> image health: OK
> images: 288 total
>     288 replaying
>
> For some reason, some values are "unknown" here. But mirroring seems to
> work, as checking on the backup cluster reveals; see for example:
>
> # rbd mirror image status zabbix-test.example.com-disk2
> zabbix-test.example.com-disk2:
>   global_id:   1bdcb981-c1c5-4172-9583-be6a6cd996ec
>   state:       up+replaying
>   description: replaying, {"bytes_per_second":8540.27,"entries_behind_primary":0,"entries_per_second":1.8,"non_primary_position":{"entry_tid":869176,"object_number":504,"tag_tid":1},"primary_position":{"entry_tid":11143,"object_number":7,"tag_tid":1}}
>   service:     rbd_mirror_backup on rbd-mirror002.example.com
>   last_update: 2024-08-12 09:53:17
>
> However, in some seemingly random cases we see that journals are never
> advanced on the primary cluster. Staying with the example above, on the
> primary cluster I find the following:
>
> # rbd journal status --image zabbix-test.example.com-disk2
> minimum_set: 1
> active_set: 126
> registered clients:
>   [id=, commit_position=[positions=[[object_number=7, tag_tid=1, entry_tid=11143], [object_number=6, tag_tid=1, entry_tid=11142], [object_number=5, tag_tid=1, entry_tid=11141], [object_number=4, tag_tid=1, entry_tid=11140]]], state=connected]
>   [id=52b80bb0-a090-4f7d-9950-c8691ed8fee9, commit_position=[positions=[[object_number=505, tag_tid=1, entry_tid=869181], [object_number=504, tag_tid=1, entry_tid=869180], [object_number=507, tag_tid=1, entry_tid=869179], [object_number=506, tag_tid=1, entry_tid=869178]]], state=connected]
>
> As you can see, the minimum_set has not advanced. Also, as the "rbd mirror
> image status" output above shows, the non_primary_position oddly appears to
> be much further advanced than the primary_position. This seems to happen
> "at random" for only a few volumes...
> There are no other active clients apart from the actual VM (libvirt+qemu).
Hi Oliver,

Were the VM clients (i.e. librbd on the hypervisor nodes) upgraded as well?

>
> As a quick fix to purge the journals piling up over and over, the only
> "solution" we've found is to temporarily disable and then re-enable
> journaling for the affected VM disks, which can be identified by:
>
> for A in $(rbd ls); do echo -n "$A: "; rbd --format=json journal status --image $A | jq '.active_set - .minimum_set'; done
>
> Any idea what is going wrong here?
> This did not happen before, when the primary cluster was running Mimic and
> the backup cluster Octopus, and it also did not happen when both were
> running Mimic.

You might be hitting https://tracker.ceph.com/issues/57396.

Thanks,

Ilya
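
A minimal sketch of the disable/re-enable workaround quoted above, assuming
journal-based (pool mode) mirroring, a hypothetical pool name "rbd", and an
arbitrary lag threshold of 10 journal object sets; adapt the image selection
to your environment before running anything:

#!/bin/sh
# Sketch only: toggle the journaling feature on images whose journal is no
# longer being trimmed, identified (as in the one-liner above) by the gap
# between active_set and minimum_set.
POOL="rbd"        # assumed pool name
THRESHOLD=10      # arbitrary number of journal object sets considered "stuck"

for IMG in $(rbd ls "$POOL"); do
    LAG=$(rbd --format=json journal status --pool "$POOL" --image "$IMG" \
          | jq '.active_set - .minimum_set')
    if [ "$LAG" -gt "$THRESHOLD" ]; then
        echo "resetting journal of $POOL/$IMG (lag: $LAG object sets)"
        # Disabling journaling removes the piled-up journal; re-enabling it
        # creates a fresh journal, after which rbd-mirror picks the image up again.
        rbd feature disable "$POOL/$IMG" journaling
        rbd feature enable "$POOL/$IMG" journaling
    fi
done

Presumably this comes at the cost of a full re-sync of each affected image on
the backup cluster, which is why it is only a stop-gap.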