Hi,

after upgrading Ceph from 14.2.8 to 14.2.16 we are seeing increased latencies.
There were no changes in hardware, configuration, workload or networking, just
a rolling update via ceph-ansible on the running production cluster. The
cluster consists of 16 OSDs (all SSD) across 4 nodes. The VMs served via RBD
from this cluster currently suffer from high CPU I/O wait.

These are some of the latencies that have increased since the update:
- op_r_latency
- op_w_latency
- kv_final_lat
- state_kv_commiting_lat
- submit_lat
- subop_w_latency

Do these latencies point to KV/RocksDB? 

These are some of the latencies that have NOT increased since the update:
- kv_sync_lat
- kv_flush_lat
- kv_commit_lat
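
For anyone who wants to compare the same counters on their own OSDs, here is a
minimal sketch of how an average latency over an interval can be derived from
two snapshots of `ceph daemon osd.N perf dump`. It assumes each latency
counter is a JSON object with "avgcount" and "sum" (seconds) fields, and that
the op_* counters sit under "osd" and the kv_* counters under "bluestore". The
function name and the sample numbers in the usage note are made up for
illustration and are not measurements from this cluster.

```python
import json


def interval_latency(before, after, section, counter):
    """Average latency in seconds per op between two perf-dump snapshots.

    `before` and `after` are parsed JSON dicts from `perf dump`.
    Returns None if no ops were counted in the interval (or if the
    counters were reset between the snapshots).
    """
    b = before[section][counter]
    a = after[section][counter]
    ops = a["avgcount"] - b["avgcount"]
    if ops <= 0:
        return None
    return (a["sum"] - b["sum"]) / ops


def load_snapshot(path):
    """Load a perf dump that was saved to a file beforehand."""
    with open(path) as fh:
        return json.load(fh)
```

Usage, with hypothetical snapshot contents: if "avgcount" goes from 100 to 300
and "sum" from 1.0 to 3.0 between the two dumps, the result is 0.01 s per op
for that interval. Using deltas rather than the lifetime average avoids mixing
pre-upgrade and post-upgrade numbers if the OSD was not restarted in between.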

I attached one graph showing the massive increase after the update.

I tried setting bluefs_buffered_io=true (its default value was changed and
it was mentioned as relevant to performance) for all OSDs on one host, but
this made no difference.
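
To rule out that the setting simply did not reach the running daemons, the
value can be read back from each OSD's admin socket with
`ceph daemon osd.N config get bluefs_buffered_io`. The sketch below only
parses that output; it assumes the command prints a small JSON object with
the value as a string, and the helper name and the OSD ids in the usage note
are hypothetical.

```python
import json


def osds_without_option(outputs, option="bluefs_buffered_io"):
    """Return the OSD ids whose running config does not report `option` as true.

    `outputs` maps an OSD id to the raw text printed by
    `ceph daemon osd.<id> config get <option>` for that OSD.
    """
    missing = []
    for osd_id, raw in sorted(outputs.items()):
        value = json.loads(raw).get(option)
        if str(value).strip().lower() != "true":
            missing.append(osd_id)
    return missing
```

Usage: collect the command output for, say, osd.0 to osd.3 on the changed host
into a dict keyed by OSD id and pass it in. An empty list means every OSD
reports the option as true in its running configuration.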

The ceph.conf is fairly simple:

[global]
cluster network = xxx
fsid = xxx
mon host = xxx
public network = xxx

[osd]
osd memory target = 10141014425
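
For reference, the memory target above is given in bytes. A small sketch of
the arithmetic, assuming the 16 OSDs are spread evenly over the 4 nodes (4 per
node; the even distribution is an assumption, not something stated above):

```python
OSD_MEMORY_TARGET = 10141014425  # bytes, from the [osd] section above
OSDS_PER_NODE = 16 // 4          # assumes an even spread over the 4 nodes


def to_gib(num_bytes):
    """Convert a byte count to GiB (2**30 bytes)."""
    return num_bytes / 2**30


per_osd_gib = to_gib(OSD_MEMORY_TARGET)
per_node_gib = to_gib(OSD_MEMORY_TARGET * OSDS_PER_NODE)
print(f"per OSD:  {per_osd_gib:.2f} GiB")
print(f"per node: {per_node_gib:.2f} GiB")
```

This comes to roughly 9.44 GiB per OSD and about 37.78 GiB per node under
that assumption. The target is a goal the OSD aims for, not a hard limit.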

Any ideas what to try? Help appreciated.

Björn






-- 

dbap GmbH 
phone +49 251 609979-0 / fax +49 251 609979-99
Heinr.-von-Kleist-Str. 47, 48161 Muenster, Germany
http://www.dbap.de

dbap GmbH, Sitz: Muenster
HRB 5891, Amtsgericht Muenster
Geschaeftsfuehrer: Bjoern Dolkemeier
