Hi Ronny,

> On 15/09/2022 14:32 ronny.lippold <c...@spark5.de> wrote:
> hi arthur, some time has passed ...
> 
> i would like to know if there is any news about your setup.
> do you have replication actively running?

No, there has been no change at CERN. I am actually switching jobs as well, so I 
won't have much news for you on the CERN infra in the future. Other people from 
the Ceph team at CERN watch this ML, though, so you might hear from them as 
well.

> we are actually using snapshot-based mirroring and recently had a move of both 
> clusters.
> after that, we had some damaged filesystems in the kvm vms.
> did you ever have such problems in your tests?
> 
> i think there are not so many people who are using ceph replication.
> for me it's hard to find the right way.
> can snapshot-based ceph replication be crash consistent? i think no.

I never noticed it myself, but yes, this is actually mentioned in the docs: 
https://docs.ceph.com/en/quincy/rbd/rbd-snapshot/ (though the mirroring docs do 
not actually explain it). I never tested that very carefully, though, and 
thought it was more of a rare occurrence than anything else.
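For reference, this is roughly how snapshot-based mirroring is driven from the 
CLI, just to illustrate that the peer cluster only ever sees whatever was on 
disk at the instant the mirror snapshot was taken (the pool/image names and the 
schedule interval below are just example placeholders):

    # enable mirroring in image mode on the pool, then snapshot-based
    # mirroring on one image (names are placeholders)
    rbd mirror pool enable libvirt-pool image
    rbd mirror image enable libvirt-pool/vm-disk-1 snapshot

    # create a mirror snapshot now; the peer replicates this point in time
    rbd mirror image snapshot libvirt-pool/vm-disk-1

    # or let the scheduler take one periodically, e.g. every hour
    rbd mirror snapshot schedule add --pool libvirt-pool --image vm-disk-1 1h

Nothing in there quiesces the guest, so the replicated image is at best 
crash-consistent.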

I heard a while back (maybe a year or so ago) that there was a long-term plan 
to automatically trigger an fsfreeze through librbd/QEMU when a snapshot is 
taken, which would probably solve your issue (and also allow application-level 
consistency via custom fsfreeze hooks). But this was apparently a tricky 
feature to add. I cc'ed Illya; maybe he knows more about that, or whether 
something else could have caused your issue.
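In the meantime, something along these lines should get you 
application-consistent mirror snapshots by hand, assuming the guests run 
qemu-guest-agent (domain and image names are placeholders, and error handling 
is omitted):

    # freeze the guest filesystems through qemu-guest-agent
    virsh domfsfreeze vm01

    # take the mirror snapshot while the guest is quiesced
    rbd mirror image snapshot libvirt-pool/vm-disk-1

    # thaw the guest again as soon as the snapshot exists
    virsh domfsthaw vm01

You would want to keep the freeze window as short as possible, since the guest 
cannot write while frozen.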

Cheers,

-- 
Arthur Outhenin-Chalandre
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
