Hi all,
Just saw this thread, so here's a quick update on our experience.
We've been testing Quincy 17.2.8, and a couple of weeks ago we ran into
this issue (before I had heard about it anywhere on the mailing list).
Our use case was CephFS on relatively full OSDs backed by NVMe. During
fairly heavy parallel tests (up to 24 client nodes, 16-way parallel per
node), a few dozen OSDs crashed with this BlueFS error. We were able to
reproduce the problem at will. We didn't see any data corruption: once
the OSDs were restarted (after stopping the test), they recovered
without any side effects we could see. I found the tracker and the
corresponding PR by googling the error message.
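In case it helps anyone checking their own cluster: the OSD crashes
should be recorded by the mgr crash module, so the backtraces (and the
error message to search for) can be pulled with the standard ceph CLI,
e.g. (<crash-id> below is a placeholder):

    # list recently recorded daemon crashes
    ceph crash ls
    # dump the full backtrace/metadata for a given crash
    ceph crash info <crash-id>
    # archive old reports once the cluster is healthy again
    ceph crash archive-all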
I applied the patch from the referenced PR on top of 17.2.8 about a
week ago (and rebuilt ceph-osd from source), and a solid week of
testing since then has not produced any BlueFS issues on the same
cluster, which gives me high confidence that the patch resolves the
issue. This is all on Quincy; we haven't tried Reef yet (this was one
of a few surprises Quincy handed us on our path to upgrade from Pacific).
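For anyone who wants to try the same before a fixed release is out, the
general recipe is something like the following -- just a sketch, not our
exact commands; <fix-commit-sha> is a placeholder for the commit(s) from
the referenced PR, and build options will vary:

    git clone --recurse-submodules --branch v17.2.8 https://github.com/ceph/ceph.git
    cd ceph
    # work on a local branch and pull the fix in on top of the 17.2.8 tag
    git checkout -b 17.2.8-bluefs-fix
    git cherry-pick <fix-commit-sha>
    ./install-deps.sh
    ./do_cmake.sh -DCMAKE_BUILD_TYPE=RelWithDebInfo
    cd build
    ninja ceph-osd

The resulting binary ends up under build/bin/ceph-osd; how you roll it
out to the OSD hosts depends on how you deploy (packages, containers, etc.).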
Andras
On 4/30/25 1:30 PM, Dan van der Ster wrote:
> Hi all,
> Just a quick heads up -- 18.2.6 has a bluefs regression [1] which was
> not present in 18.2.4.
> Please avoid upgrading until further notice.
> Regards, Dan
> [1] Introduced in https://tracker.ceph.com/issues/65356 and fixed in
> https://tracker.ceph.com/issues/69764
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io