Hello cephers!

I have a Luminous 12.2.5 cluster of 3 nodes with 5 OSDs each, serving S3 through RGW. All OSDs are HDDs.

About twice a day I hit a slow request problem that reduces cluster performance. It can start during the daytime peak or at night; the time of day doesn't seem to matter.

This is what I see in ceph health detail: https://avatars.mds.yandex.net/get-pdb/234183/9ba023d0-4352-4235-8826-76b412016e9f/s1200

Top and iostat output on the node hosting osd.21:
https://avatars.mds.yandex.net/get-pdb/51720/52ef79c1-eb1a-450a-8c95-675077045b84/s1200

https://avatars.mds.yandex.net/get-pdb/51720/0d98131c-82d3-4274-a406-743490e1f966/s1200

In effect, it reduces the cluster's I/O operations for about half an hour, twice a day:
https://avatars.mds.yandex.net/get-pdb/222681/bed8f638-f259-403e-83cb-c7bfb30f14f1/s1200

This is normal I/O while the status is OK:
https://avatars.mds.yandex.net/get-pdb/245485/33ee3a53-083a-4656-b585-8df0007db2e2/s1200

This is how it affects incoming traffic to RGW: https://avatars.mds.yandex.net/get-pdb/51720/5a486d30-0d44-46f0-8f0f-668a05947bc8/s1200

Since it can start at any time of day, but happens twice a day and lasts for a fixed period, I assume it could be some recovery or rebalancing operation.
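To check whether the slowdowns really follow a fixed schedule, the timestamps of any log lines that mention slow requests could be counted per hour. This is only a minimal sketch: it assumes each log line starts with a "YYYY-MM-DD HH:MM:SS" timestamp and that the relevant lines contain the phrase "slow request". The function name is made up for illustration, and which log file actually holds such lines on a given setup is also an assumption.

```python
import re
from collections import Counter

# Assumed layout: every log line begins with "YYYY-MM-DD HH:MM:SS".
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2}) (\d{2}):\d{2}:\d{2}")


def slow_requests_per_hour(lines):
    """Count lines mentioning 'slow request', keyed by (date, hour).

    Illustrative helper, not part of any Ceph tooling.
    """
    counts = Counter()
    for line in lines:
        if "slow request" not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:  # skip lines that do not start with a timestamp
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


def print_histogram(lines):
    """Print one row per (date, hour) bucket, in chronological order."""
    for (day, hour), n in sorted(slow_requests_per_hour(lines).items()):
        print(f"{day} {hour:02d}:00  {n}")
```

Passing an open log file to print_histogram would show whether the warnings fall into the same hours every day, which would point at something scheduled rather than at client load.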

I tried to find something in the OSD logs, but there is nothing about it.

Any thoughts on how to avoid it?

I'd appreciate your help.

--
Grigory Murashov

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
