This also sounds a bit like my second problem here:
http://tracker.ceph.com/issues/5216
On 31.05.2013 at 20:36, John Nielsen wrote:
Possibly related:
http://tracker.ceph.com/issues/5084
I'm seeing the same big delays with peering, and when I marked an OSD "out"
today and then "in" a minute or two later, it was unexpectedly marked "down". I
restarted it, and 8 or so minutes later things were fine again. In the meantime
our RBD KVM
I'm not sure if the problems we are seeing are the same, but it looks like
it. Just a few hours ago, one slow OSD caused a lot of problems for us. It
was somehow reported down, and while the cluster was trying to adjust, it said
it was wrongly marked down. So it seems some pgs were stuck in peering. We
Hello,
Speaking of the rotating-media-under-filestore case (which must be the most
common in Ceph deployments), can peering be made less greedy for disk
operations without slowing down the entire 'blackhole timeout', e.g. when it
blocks client operations? I'm suffering from a very long and very
disk-intensive peering process