>>Can you capture a perf top or perf record to see where the CPU time is 
>>going on one of the OSDs with a high latency?

Yes, sure. I'll do it next week and send the results to the mailing list.

Thanks, Sage!
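
For reference, a minimal sketch of how the requested capture could be scripted. It only builds the perf command lines and does not run them. The helper names, the placeholder pid, the 30-second window and the output file name are assumptions made for illustration and are not something from this thread; the pid of the slow ceph-osd process has to be looked up separately.

```python
import shlex


def build_perf_record_cmd(osd_pid, seconds=30, output="osd-perf.data"):
    """Return the argv for sampling one running process with call graphs.

    osd_pid: pid of the ceph-osd process showing high latency (looked up by hand).
    seconds: how long to sample; 30 is an arbitrary illustrative default.
    output:  file that perf writes its samples to (hypothetical name).
    """
    if osd_pid <= 0:
        raise ValueError("osd_pid must be a positive process id")
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return [
        "perf", "record",
        "-g",                          # record call graphs, not just flat samples
        "-p", str(osd_pid),            # attach to the already running process
        "-o", output,                  # write samples to this file
        "--", "sleep", str(seconds),   # keep sampling for as long as sleep runs
    ]


def build_perf_report_cmd(output="osd-perf.data"):
    """Return the argv for a plain-text report that can be pasted into a mail."""
    return ["perf", "report", "-i", output, "--stdio"]


if __name__ == "__main__":
    # 12345 is a placeholder pid, not a real OSD.
    print(shlex.join(build_perf_record_cmd(12345)))
    print(shlex.join(build_perf_report_cmd()))
```

The printed commands can be reviewed and then run as root on the OSD host. For a quick interactive look, perf top -p with the same pid is the live equivalent.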
----- Original Message -----
From: "Sage Weil" <s...@newdream.net>
To: "aderumier" <aderum...@odiso.com>
Cc: "ceph-users" <ceph-users@lists.ceph.com>, "ceph-devel" 
Sent: Friday, 25 January 2019 10:49:02
Subject: Re: ceph osd commit latency increase over time, until restart

Can you capture a perf top or perf record to see where the CPU time is 
going on one of the OSDs with a high latency? 


On Fri, 25 Jan 2019, Alexandre DERUMIER wrote: 

> Hi, 
> I am seeing strange behaviour on my OSDs, on multiple clusters. 
> All clusters are running mimic 13.2.1, bluestore, with SSD or NVMe drives. 
> The workload is rbd only: qemu-kvm VMs running with librbd, plus snapshot/rbd 
> export-diff/snapshot delete each day for backup. 
> When the OSDs are freshly started, the commit latency is between 0.5 and 1 ms. 
> But over time this latency increases slowly (maybe around 1 ms per day), until 
> it reaches crazy values like 20-200 ms. 
> Some example graphs: 
> http://odisoweb1.odiso.net/osdlatency1.png 
> http://odisoweb1.odiso.net/osdlatency2.png 
> All OSDs show this behaviour, in all clusters. 
> The latency of the physical disks is OK. (The clusters are far from fully loaded.) 
> And if I restart the OSD, the latency comes back to 0.5-1 ms. 
> This reminds me of the old tcmalloc bug, but could it be a bluestore memory 
> bug? 
> Any hints for counters/logs to check? 
> Regards, 
> Alexandre 
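
To put a number on the drift described above (roughly 1 ms per day), a least-squares slope over a few latency readings may be useful when comparing OSDs or clusters. This is only a sketch: the function names are hypothetical, and the readings in the example are made-up values, not measurements from these clusters. The (day, latency in ms) pairs would come from whatever monitoring produces the graphs.

```python
def latency_drift_per_day(samples):
    """Least-squares slope of commit latency (ms) against time (days).

    samples: list of (day, latency_ms) pairs, in any order.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_l = sum(l for _, l in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("all samples share the same timestamp")
    cov = sum((t - mean_t) * (l - mean_l) for t, l in samples)
    return cov / var_t


def days_until(samples, threshold_ms):
    """Days after the latest sample until the fitted trend reaches threshold_ms.

    Returns 0.0 if the latest sample is already at or above the threshold,
    and None if the latency is not increasing.
    """
    slope = latency_drift_per_day(samples)
    last_t, last_l = max(samples)  # latest sample by day
    if last_l >= threshold_ms:
        return 0.0
    if slope <= 0:
        return None
    return (threshold_ms - last_l) / slope


if __name__ == "__main__":
    # Made-up readings: day since OSD restart, commit latency in ms.
    readings = [(0, 1.0), (1, 2.0), (2, 3.0), (3, 4.0)]
    print(latency_drift_per_day(readings))
    print(days_until(readings, 20.0))
```

A straight-line fit is a deliberate simplification. If the graphs show the latency accelerating rather than growing steadily, the slope only describes the average drift over the sampled window.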

ceph-users mailing list
