Hi,
I work with Tomasz and I'm investigating this situation. We still don't
fully understand why there were unfound objects after removing a single OSD.
From logs[1] it looks like all PGs were active+clean before marking that
OSD out. After that backfills started on multiple OSDs. Three minutes
later
Hi,
Do you understand why removing that osd led to unfound objects? Do you have
the ceph.log from yesterday?
Cheers, Dan
On 2 Oct 2016 10:18, "Tomasz Kuzemko" wrote:
>
> Forgot to mention Ceph version - 0.94.5.
>
> I managed to fix this. By chance I found that when an OSD for a blocked
PG is st
Forgot to mention Ceph version - 0.94.5.
I managed to fix this. By chance I found that when an OSD for a blocked PG
is starting, there is a few-second time window (after load_pgs) in which it
accepts commands related to the blocked PG. So first I managed to capture
"ceph pg PGID query" this way. T
Hi,
I have a production cluster on which 1 OSD on a failing disk was slowing
the whole cluster down. I removed the OSD (osd.87) as usual in such cases,
but this time it resulted in 17 unfound objects. I no longer have the files
from osd.87. I was able to call "ceph pg PGID mark_unfound_lost delete