Yeah.
I've been monitoring such issue reports for a while and it looks like
something is definitely wrong with response times under certain
circumstances. Not sure if all these reports have the same root cause
though.
Scrubbing seems to be one of the triggers.
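If scrubbing is a suspected trigger, one way to test the hypothesis (a sketch only, assuming a cluster where scrubs can safely be paused for a while) is to set the cluster-wide noscrub flags and watch whether the latency spikes go away:

```shell
# Temporarily pause all scrubbing cluster-wide while observing latency
ceph osd set noscrub
ceph osd set nodeep-scrub

# Watch per-OSD commit/apply latency for a while, e.g.:
ceph osd perf

# Re-enable scrubbing afterwards
ceph osd unset noscrub
ceph osd unset nodeep-scrub
```

If the blocking stops while the flags are set, that would at least narrow the problem down to the scrub path rather than client I/O.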
Perhaps we need more low-level detect
Hi Igor,
Thanks for your reply.
I can verify that discard is disabled in our cluster:
10:03 root@node106b [fra]:~# ceph daemon osd.417 config show | grep discard
"bdev_async_discard": "false",
"bdev_enable_discard": "false",
[...]
So there must be something else causing the problems.
Thanks
Hi Denny,
I don't remember exactly when discards appeared in BlueStore, but they are
disabled by default:
See bdev_enable_discard option.
Thanks,
Igor
On 2/15/2019 2:12 PM, Denny Kreische wrote:
Hi,
two weeks ago we upgraded one of our ceph clusters from luminous 12.2.8 to
mimic 13.2.4, cluster is SSD-only, bluestore-only, 68 nodes, 408 OSDs.
Somehow we see strange behaviour since then. Single OSDs seem to block for
around 5 minutes and this causes the whole cluster and connected applic