Hi All,
I'm kind of crossposting this from here:
https://forum.proxmox.com/threads/i-o-wait-after-upgrade-5-x-to-6-2-and-ceph-luminous-to-nautilus.73581/
But since I'm more and more sure that it's a Ceph problem, I'll try my
luck here.
Since updating from Luminous to Nautilus I have a big problem with I/O wait.
On 29/07/2020 15:04, Wido den Hollander wrote:
On 29/07/2020 14:52, Raffael Bachmann wrote:
[...]
You can analyze the RocksDB compaction events by running this script:
https://github.com/ceph/cbt/blob/master/tools/ceph_rocksdb_log_parser.py
That can give you an idea of how long your compaction events are
lasting and what they are doing.
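For example, assuming the default log location and that the script takes
the OSD log file as its argument (check its help output for the exact
usage), something like:

  # fetch the parser and point it at an OSD log containing the
  # RocksDB compaction output (OSD id 0 is just an example)
  wget https://raw.githubusercontent.com/ceph/cbt/master/tools/ceph_rocksdb_log_parser.py
  python3 ceph_rocksdb_log_parser.py /var/log/ceph/ceph-osd.0.log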
Mark
On 7/29/20 7:52 AM, Raffael Bachmann wrote:
[...]
"log_latency_fn slow operation observed for"
lines?
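Assuming the default log location, a quick way to check is:

  # count the slow-operation warnings per OSD log
  grep -c "log_latency_fn slow operation observed" /var/log/ceph/ceph-osd.*.log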
Have you tried the "osd bench" command for your OSDs? Does it show
similar numbers for every OSD?
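For example (by default the bench writes 1 GiB in 4 MiB objects and
reports the throughput):

  # run the built-in write benchmark per OSD and compare the numbers
  ceph tell osd.0 bench
  ceph tell osd.1 bench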
You might want to try manual offline DB compaction using
ceph-kvstore-tool. Any improvements after that?
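A sketch for a single OSD, assuming the default data path (the OSD has
to be stopped while the tool runs):

  systemctl stop ceph-osd@0
  # compact the RocksDB backing the OSD's BlueStore metadata
  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-0 compact
  systemctl start ceph-osd@0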
Thanks,
Igor
[...]TQo/edit?usp=sharing
No wonder you are seeing periodic stalls. How many DBs per NVMe
drive? What's your cluster workload typically like? Also, can you see
if the NVMe drive's aqu-sz is getting large while requests wait to be
serviced?
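Something like this will show it (on older sysstat versions the column
is called avgqu-sz):

  # extended per-device stats every second; watch the aqu-sz column
  # for the NVMe drives backing the DB/WAL
  iostat -x 1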
Mark
On 7/29/20 8:35 AM, Raffael Bachmann wrote:
[...]