2016-07-14 15:26 GMT+02:00 Luis Periquito <periqu...@gmail.com>:

> Hi Jaroslaw,
>
> several things spring to mind. I'm assuming the cluster is
> healthy (other than the slow requests), right?
>
>
Yes.
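
For reference, we verify this with the standard CLI checks (nothing
cluster-specific assumed):

ceph -s              # overall status; slow requests show up as health warnings
ceph health detail   # per-issue breakdown if anything is flagged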



> From the (little) information you sent, it seems the pools are
> replicated with size 3, is that correct?
>
>
True.
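
For reference, confirmed with the standard pool query (pool names from
our setup):

ceph osd pool get .rgw.buckets size     # returns "size: 3"
ceph osd dump | grep "replicated size"  # replication size of all pools at once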


> Are there any long-running delete processes? They usually have a
> negative impact on performance, especially as they don't really show
> up in the IOPS statistics.
>

During normal throughput we have a small number of deletes.
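
Since .rgw.buckets is an RGW data pool, deletes mostly end up in the RGW
garbage collector rather than in client IOPS, so a pending backlog is
easy to miss. A quick way to check (standard radosgw-admin, assuming
default gc settings):

radosgw-admin gc list --include-all   # objects still queued for deletion
radosgw-admin gc process              # run gc manually if a large backlog shows up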


> I've also seen something like this happen when there's a slow
> disk/OSD. You can try to check with "ceph osd perf" and look for
> higher numbers. Usually restarting that OSD brings the cluster back
> to life, if that's the issue.
>

I will check this.
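
A rough sketch of what we plan to run, following the suggestion above
(the osd id 12 is only a placeholder, and the restart command depends
on the init system on the node):

ceph osd perf                        # look for outliers in commit/apply latency
sudo systemctl restart ceph-osd@12   # systemd nodes (placeholder id)
sudo restart ceph-osd id=12          # upstart nodes (placeholder id)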



> If nothing shows, try a "ceph tell osd.* version"; if there's a
> misbehaving OSD they usually don't respond to the command (slow or
> even timing out).
>
> You also don't say how many scrub/deep-scrub processes are running.
> If not properly handled, they are also a performance killer.
>
>
Scrub/deep-scrub processes are disabled.
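
For reference, the usual way to disable them cluster-wide is the
noscrub/nodeep-scrub flags, which can be verified together with the
per-OSD version probe suggested above (a sketch with standard commands):

ceph osd dump | grep flags    # should list noscrub,nodeep-scrub if set that way
ceph tell osd.* version       # a hung OSD tends to answer slowly or time out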


> Last, but by far not least, have you ever thought of creating an SSD
> pool (even small) and moving all pools but .rgw.buckets there? The
> other ones are small enough, but would enjoy having their own
> "reserved" OSDs...
>

This is an idea we had some time ago; we will try that.
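
A rough sketch of what that could look like, assuming a pre-Luminous
release without CRUSH device classes (all bucket/rule names below are
hypothetical, and the SSD-backed OSDs would first have to be moved
under the new CRUSH root):

ceph osd crush add-bucket ssd root                        # hypothetical separate root for SSD hosts
ceph osd crush rule create-simple ssd_rule ssd host       # rule placing data on that root
ceph osd pool set .rgw.buckets.index crush_ruleset <id>   # id from "ceph osd crush rule dump"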

One important thing:

sysop@s41617:~/bin$ ceph osd pool get .rgw.buckets pg_num
pg_num: 4470
sysop@s41617:~/bin$ ceph osd pool get .rgw.buckets.index pg_num
pg_num: 2048

Could this be the main problem?
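
For context, the back-of-the-envelope check here is PGs per OSD: the
sum over all pools of pg_num times replica size, divided by the number
of OSDs (roughly 100 per OSD is the usual target). The actual per-OSD
figure can also be read directly:

ceph osd df                  # per-OSD PG counts in the PGS column
ceph osd dump | grep pg_num  # pg_num of every pool, for the totals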


Regards
-- 
Jarek
