Hi all,

Just saw this thread, so here's a quick update on our experience.
We've been testing Quincy 17.2.8, and a couple of weeks ago we ran into this issue (before I heard about it anywhere on the mailing list).  Our use case is CephFS on relatively full OSDs backed by NVMe.  During fairly heavily parallel tests (up to 24 client nodes, 16-way parallel per node) a few dozen OSDs crashed with this BlueFS error, and we were able to reproduce the problem at will.  We didn't see any data corruption: once the OSDs were restarted (after stopping the test), they recovered without any side effects we could see.  I found the tracker and the corresponding PR by googling the error message.  I applied the patch from the referenced PR on top of 17.2.8 about a week ago (and rebuilt ceph-osd from source), and a solid week of testing since then has not produced any BlueFS issues on the same cluster, which gives me high confidence that the patch resolves the issue.  This is all on Quincy; we haven't tried Reef yet (this was one of a few surprises Quincy handed us on our path to upgrade from Pacific).

Andras


On 4/30/25 1:30 PM, Dan van der Ster wrote:
Hi all,

Just a quick heads up -- 18.2.6 has a bluefs regression [1] which was
not present in 18.2.4.
Please avoid upgrading until further notice.

Regards, Dan

[1] Introduced in https://tracker.ceph.com/issues/65356 and fixed in
https://tracker.ceph.com/issues/69764

