[ceph-users] Re: SLOW_OPS problems

Tim Sauerbein Mon, 30 Sep 2024 07:15:35 -0700

Thanks for the replies everyone!

> On 30 Sep 2024, at 13:10, Anthony D'Atri <a...@dreamsnake.net> wrote:
> 
> Remember that slow ops are a tip of the iceberg thing, you only see the ones 
> that crest above 30s

So far metrics of the hosted VMs show no other I/O slowdown except when these 
hiccups occur.
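
To look for ops that are slow but stay below the 30s warning threshold, the
recent-op history of an OSD can be filtered by duration. This is a minimal
sketch, not a definitive tool. It assumes the JSON printed by
`ceph daemon osd.<id> dump_historic_ops` has an "ops" list whose entries carry
"duration" (seconds) and "description" fields. The helper name and the sample
data are made up for illustration.

```python
import json


def ops_slower_than(dump, threshold_s):
    """Return (duration, description) pairs above threshold_s, slowest first.

    `dump` is the parsed JSON of an op-history dump. The field names
    "ops", "duration" and "description" are assumptions about that format.
    """
    slow = [
        (float(op["duration"]), op.get("description", ""))
        for op in dump.get("ops", [])
        if float(op.get("duration", 0)) > threshold_s
    ]
    return sorted(slow, reverse=True)


# Hand-written sample data, not taken from a real cluster.
sample = json.loads(
    '{"ops": ['
    '{"description": "osd_op(a)", "duration": 0.8},'
    '{"description": "osd_op(b)", "duration": 12.5},'
    '{"description": "osd_op(c)", "duration": 4.2}]}'
)

for duration, description in ops_slower_than(sample, 1.0):
    print(f"{duration:8.3f}s  {description}")
```

Running this against a dump saved from each affected OSD, with a threshold of
a second or so, would show whether slower-than-usual ops also occur between
the reported incidents.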

> On 30 Sep 2024, at 13:35, Igor Fedotov <igor.fedo...@croit.io> wrote:
> 
> there is no log attached to your post, you'd better share it via some other 
> means.
> 
> BTW - what log did you mean - monitor or OSD one?
> 
> It would be nice to have logs for a couple of OSDs suffering from slow ops, 
> preferably relevant to two different cases.

Sorry, the attachments have apparently been stripped. The gist below has the 
monitor log, the affected OSD logs and the iostat log for one incident (they 
all look the same, but I can share more if relevant):

https://gist.github.com/sauerbein/5a485a6d2546475912709743e3cfbf4b

Let me know if you need any other logs to analyse!

> On 30 Sep 2024, at 14:34, Alexander Schreiber <a...@thangorodrim.ch> wrote:
> 
> One cause for "slow ops" I discovered is networking issues. I had slow
> ops across my entire cluster (interconnected with 10G). Turns out the
> switch was bad and achieved < 10 MBit/s on one of the 10G links.
> Replaced the switch, tested the links again - got full 10G connectivity
> and the slow ops disappeared.

Thanks for the idea. The hosts are connected to two switches with fail-over 
bonding, normally communicating via the same switch. I will move them all over 
to the second switch to rule out a switch issue.
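
Before and after such a move, it helps to confirm which bond member each host
is actually using. The sketch below parses the status text that the Linux
bonding driver exposes under /proc/net/bonding/<bond>. The bond name `bond0`
and the interface names in the sample are assumptions for illustration only.

```python
def active_slave(bonding_status):
    """Return the interface named on the 'Currently Active Slave' line, or None."""
    for line in bonding_status.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Currently Active Slave":
            return value.strip()
    return None


# Hand-written sample in the style of /proc/net/bonding/bond0;
# the interface name eno1 is hypothetical.
sample = """Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eno1
MII Status: up
"""

print(active_slave(sample))

# On a real host (assuming the bond is called bond0):
# with open("/proc/net/bonding/bond0") as f:
#     print(active_slave(f.read()))
```

Mapping the reported interface to the switch it is cabled to shows which
switch carries the traffic on each host.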

Best regards,
Tim
