Hi Greg,
Yes, this was caused by a chain of events. As a cautionary tale, the main
ones were:
1) a minor Nautilus release upgrade, followed by a rolling node restart
script that mistakenly relied on "ceph -s" for cluster health info,
i.e. it didn't wait for the cluster to return to health before moving on
to the next node (a minimal health-wait sketch follows below)
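For anyone writing a similar script: below is a minimal sketch of the kind
of health gate we should have had. It assumes plain bash, passwordless ssh
to each OSD node, and the standard ceph-osd.target systemd unit; adjust to
your own deployment.

  #!/bin/bash
  # Block until the cluster reports HEALTH_OK before touching the next node.
  # "ceph -s" only prints a status snapshot; it does not wait for recovery.
  wait_for_health_ok() {
      until [ "$(ceph health | head -n 1 | awk '{print $1}')" = "HEALTH_OK" ]; do
          echo "cluster not healthy yet, sleeping 30s..."
          sleep 30
      done
  }

  for node in "$@"; do
      ssh "$node" "systemctl restart ceph-osd.target"   # restart OSDs on one node
      wait_for_health_ok                                # gate on recovery, not a snapshot
  done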
On Wed, Mar 25, 2020 at 5:19 AM Jake Grimmett wrote:
>
> Dear All,
>
> We are "in a bit of a pickle"...
>
> No reply to my message (23/03/2020), subject "OSD: FAILED
> ceph_assert(clone_size.count(clone))"
>
> So I'm presuming it's not possible to recover the crashed OSD
From your later email i
Hi Eugen,
Many thanks for your reply.
The other two OSDs are up and running, and being used by other pgs with
no problem; for some reason this pg refuses to use them.
The other two OSDs that are missing from this pg crashed at different
times last month, each OSD crashed when we trie
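If the data on the crashed OSDs is still intact, one option we are
considering (only a sketch, not something I have run end-to-end yet) is
exporting the missing PG shards from the dead OSDs with
ceph-objectstore-tool and importing them into a healthy OSD. The IDs and
paths below are placeholders:

  # on the host with the crashed OSD (the OSD must be stopped)
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> \
      --pgid <pgid> --op export --file /tmp/pg-<pgid>.export

  # on a host with a healthy OSD (also stopped), then start it again
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<other-id> \
      --op import --file /tmp/pg-<pgid>.export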
Hi,
is there any chance to recover the other failing OSDs that seem to
have one chunk of this PG? Do the other OSDs fail with the same error?
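A quick way to check (assuming the crash module is enabled on your
Nautilus cluster) would be something like:

  ceph crash ls                 # list recorded daemon crashes
  ceph crash info <crash-id>    # show the backtrace / assert for one crash

or grepping the OSD logs directly:

  journalctl -u ceph-osd@<id> | grep -i ceph_assert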
Quoting Jake Grimmett:
Dear All,
We are "in a bit of a pickle"...
No reply to my message (23/03/2020), subject "OSD: FAILED
ceph_assert(clone_size.count(clone))"