Hi Michel,

Our understanding is that the bug is quite rare -- it seems to me that
there is something about the workload / IO pattern that triggers it.
I raised the alarm about this because the first cluster I upgraded to
18.2.6 had 4 OSDs crash immediately during upgrade, luckily all in the
same failure domain. (So for that cluster, we paused the upgrade and
are awaiting 18.2.7 to resume).

17.2.8 has the bug too -- there are reports about that in
https://tracker.ceph.com/issues/69764.

My overall guess/hope -- if an OSD hasn't hit the crash yet, then it
will most likely survive until we get 18.2.7 out.

Cheers, Dan

On Thu, May 1, 2025 at 12:17 AM Michel Jouvin
<michel.jou...@ijclab.in2p3.fr> wrote:
>
> Hi Dan,
>
> Thanks for the advice. Is it possible to get a bit more information on the 
> risk posed by this bug? I tried to read the issues you referenced and the 
> others linked from them, but I didn't get the full picture... Is it 
> just an OSD crash, or is there a risk of data corruption? I'm afraid it is 
> more likely the latter...
>
> Cheers,
>
> Michel
> Sent from my mobile
>
> On 30 April 2025 21:14:29, Dan van der Ster <dan.vanders...@clyso.com> wrote:
>
>> Hi Michel,
>>
>> 19.2.2 should be immune from this bug, but I don't know if 19.2.2 has
>> enough real life usage to check for other potential issues.
>>
>> We're working on an 18.2.7 hotfix now -- my personal suggestion is to
>> wait for that.
>>
>> Cheers, Dan
>>
>>
>> On Wed, Apr 30, 2025 at 11:51 AM Michel Jouvin
>> <michel.jou...@ijclab.in2p3.fr> wrote:
>>>
>>>
>>> Hi Dan,
>>>
>>> Bad luck, we just upgraded today! Is it so severe that we should plan to 
>>> downgrade to 18.2.4 if feasible? Or, if the issue is not present in Squid, 
>>> to upgrade to 19.2.2?
>>>
>>> Best regards,
>>>
>>> Michel
>>> Sent from my mobile
>>>
>>> On 30 April 2025 19:32:45, Dan van der Ster <dan.vanders...@clyso.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Just a quick heads up -- 18.2.6 has a bluefs regression [1] which was
>>>> not present in 18.2.4.
>>>> Please avoid upgrading until further notice.
>>>>
>>>> Regards, Dan
>>>>
>>>> [1] Introduced in https://tracker.ceph.com/issues/65356 and fixed in
>>>> https://tracker.ceph.com/issues/69764
>>>>
>>>> --
>>>> Dan van der Ster
>>>> Ceph Executive Council | CTO @ CLYSO
>>>> https://clyso.com | dan.vanders...@clyso.com
>>>> _______________________________________________
>>>> ceph-users mailing list -- ceph-users@ceph.io
>>>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>>
>>>
>>>
>>
>>
>> --
>> Dan van der Ster
>> Ceph Executive Council | CTO @ CLYSO
>> https://clyso.com | dan.vanders...@clyso.com
>
>


-- 
Dan van der Ster
Ceph Executive Council | CTO @ CLYSO
Try our Ceph Analyzer -- https://analyzer.clyso.com/
https://clyso.com | dan.vanders...@clyso.com
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
