I'm not sure the problems we are seeing are the same, but it looks like
it. Just a few hours ago, one slow OSD caused a lot of problems for us. It
was somehow reported down, and while the cluster was trying to adjust, it
said it was wrongly marked down, so some PGs seemed stuck in peering. We
restarted the OSD and the cluster adjusted, but after a while the OSD was
reported down again and the whole process repeated. We then tried to keep
the OSD down by setting noup, waited a while with no luck, and repeated
that as well. Even though there seemed to be no hardware problem, we
finally marked the OSD out and started recovery.
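
For reference, what we ran was roughly the following (osd.12 is just an
example id, not the actual one):

    ceph osd set noup         # keep the flapping OSD from being marked up again
    ceph osd down 12          # mark the suspect OSD down
    # ... waited a while with no luck, so eventually:
    ceph osd unset noup
    ceph osd out 12           # mark it out and let recovery start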
Initial peering, as you said, seems so resource intensive that it caused
another ~10 OSDs to be reported down, which increased the number of PGs in
peering, and then they all said they were wrongly marked down... We have
already lowered all the recovery parameters, so a full recovery now takes
about 2-3 hours, but that makes no difference in the starting phase of the
process, which may take up to 10 minutes. We have RBD-backed KVM instances
and they are totally frozen for those 10 minutes. And if some PGs get
stuck in peering, it takes manual intervention (restarting the OSD is what
we could come up with) before anything can actually continue working.
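
For the record, the recovery throttling we applied was along these lines
(the values are examples and I'm writing the option names from memory, so
the exact syntax may differ on 0.56):

    ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'

or the equivalent in ceph.conf under [osd]:

    osd max backfills = 1
    osd recovery max active = 1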
We found http://www.spinics.net/lists/ceph-users/msg00009.html but it
doesn't offer much. We are running 0.56.4.


On Thu, May 2, 2013 at 4:57 PM, Andrey Korolyov <and...@xdel.ru> wrote:

> Hello,
>
> Speaking of the rotating-media-under-filestore case (which must be the
> most common in Ceph deployments), can peering be made less greedy for
> disk operations without stretching the whole 'blackhole timeout', i.e.
> the window in which it blocks client operations? I'm suffering from a
> very long and very disk-intensive peering process even on relatively
> small reweights when there is a more or less significant commit on the
> underlying storage (50% is very hard to deal with, 10% of disk commit is
> much more acceptable). Recovery by itself can be throttled low enough
> not to compete with client disk I/O, but slowing down the peering
> process means freezing client I/O for a longer time, that's all.
> Cuttlefish seems to do part of the disk controller's job of merging
> writes, but peering is still unacceptably long for an IOPS-intensive
> cluster (5 MB/s and 800 IOPS on every disk during peering; despite the
> controller aligning head movements, the disks are 100% busy). An
> SSD-based cluster should not die from lack of IOPS, but prices for such
> a thing are still closer to TrueEnterpriseStorage(tm) than any solution
> I can afford.



-- 
erdem agaoglu