[ceph-users] Re: Which version of Ceph fully supports CephFS Snapshot?

2021-01-13 Thread Wido den Hollander
In addition: make sure you are using kernels with the proper fixes. CephFS is a co-operation between the MDS, OSDs, and (kernel) clients. If the clients are outdated they can cause all kinds of trouble, so make sure you are able to update clients to recent versions. Although a stock CentOS or Ub
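Not spelled out in the message itself, but as a rough sketch of checking what client versions are actually connected, the cluster side can report it (the MDS name "mds-a" below is a placeholder, and output varies by release):

    # Summary of the feature bits / release names of connected clients
    ceph features

    # List CephFS sessions on a given MDS, including client version strings
    # (replace "mds-a" with an MDS daemon name from your cluster)
    ceph tell mds.mds-a client ls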

[ceph-users] How to reset an OSD

2021-01-13 Thread Pfannes, Fabian
We are running a small Ceph cluster with two nodes. Our failureDomain is set to host to have the data replicated between the two hosts. The other night one host crashed hard and three OSDs won't recover with either debug 2021-01-13T08:13:17.855+ 7f9bfbd6ef40 -1 osd.23 0 OSD::init() : unable to r

[ceph-users] Re: How to reset an OSD

2021-01-13 Thread Andreas John
Hello, I suspect there was unwritten data in RAM which didn't make it to the disk. This shouldn't happen, that's why the journal is in place. If you have size=2 in your pool, there is one copy on the other host. To delete the OSD you could probably do ceph osd crush remove osd.x ceph osd rm osd.x
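For context, the commonly documented removal sequence expands on the two commands quoted above; osd.x is a placeholder for the failed OSD's id, and the exact steps should be checked against the docs for your release:

    # Stop the daemon on the affected host first (systemd unit name may differ)
    systemctl stop ceph-osd@x

    # Mark the OSD out so data rebalances off it, then remove it from the CRUSH map
    ceph osd out osd.x
    ceph osd crush remove osd.x

    # Remove its auth key and finally delete the OSD id from the cluster
    ceph auth del osd.x
    ceph osd rm osd.x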

[ceph-users] OSDs in pool full : can't restart to clean

2021-01-13 Thread Paul Mezzanini
Hey all, we landed in a bad place (tm) with our NVMe metadata tier. I'll root cause how we got here after it's all back up. I suspect a pool got misconfigured and just filled it all up. Short version: the OSDs are all full (or full enough) that I can't get them to spin back up. They c
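The thread does not show the exact diagnostics used, but a hedged sketch of inspecting a full BlueStore OSD offline before touching it might look like this (the OSD path uses id 0 as a placeholder):

    # Show the labels of the devices backing the down OSD
    ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-0

    # Report the device sizes and free space as BlueFS sees them
    ceph-bluestore-tool bluefs-bdev-sizes --path /var/lib/ceph/osd/ceph-0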

[ceph-users] Re: OSDs in pool full : can't restart to clean

2021-01-13 Thread Paul Mezzanini
We may have found a way out of the jam. ceph-bluestore-tool's bluefs-bdev-migrate is successfully getting data moved into another LV and then we can manually start the OSDs to get the captive PGs out. It is not a fix I would trust beyond getting out of jail and I completely plan on blowing awa
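For reference, an invocation of the tool mentioned above would look roughly like the following; the OSD path, source device, and target LV are placeholders, since the thread does not give the real ones, and this is an escape hatch rather than a recommended fix:

    # Move BlueFS data from the listed source device(s) onto a spare LV
    ceph-bluestore-tool bluefs-bdev-migrate \
        --path /var/lib/ceph/osd/ceph-0 \
        --devs-source /var/lib/ceph/osd/ceph-0/block \
        --dev-target /dev/vg_spare/lv_osd0_spill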