I've just started on this myself.

I started with https://ceph.com/docs/v0.80/dev/perf_counters/

I'm currently monitoring latency using (to pick one example)
[op_w_latency][sum] and [op_w_latency][avgcount].  Both values are
counters, so they only increase with time.  The lifetime average latency of
the cluster isn't very useful, so I track the deltas of those values
between samples, then divide the sum delta by the avgcount delta to get the
average latency over my sample period.
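
As a rough sketch of that delta calculation in Python: the function name
and the sample numbers below are made up for illustration, and the two
dicts stand in for the op_w_latency entry taken from two successive perf
dumps of the same OSD.

```python
def interval_latency(prev, curr):
    """Average latency per op between two perf-dump samples.

    prev and curr are dicts with "sum" and "avgcount" keys, e.g. the
    op_w_latency entry from two dumps taken one sample period apart.
    The result is in the same unit as "sum".

    Returns None when no ops completed in the interval, or when a
    counter went backwards (for example after a daemon restart).
    """
    d_count = curr["avgcount"] - prev["avgcount"]
    d_sum = curr["sum"] - prev["sum"]
    if d_count <= 0 or d_sum < 0:
        return None
    return d_sum / d_count


# Made-up sample values, not real cluster output.
earlier = {"avgcount": 1000, "sum": 12.5}
later = {"avgcount": 1500, "sum": 20.0}
print(interval_latency(earlier, later))
```

The dicts would normally come from parsing the JSON printed by the admin
socket's "perf dump" command described on the page linked above.  Returning
None for an idle or reset interval is a choice made for this sketch; a
graphing setup might prefer to skip the point or plot zero instead.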

Just graphing the latencies let me see a spike in write latency on all
disks on one node, which eventually led me to a dead write-cache battery.


That's for the OSDs.  I have similar things set up for MON and RadosGW.


I'm sure there are many more useful things to graph.  One of the things I'm
interested in (but haven't found time to research yet) is the journal
usage, with maybe some alerts if the journal is more than 90% full.



On Mon, Oct 13, 2014 at 2:57 PM, Jakes John <jakesjohn12...@gmail.com>
wrote:

> Bump :). It would be helpful if someone could share info related to
> debugging using counters/stats.
>
> On Sun, Oct 12, 2014 at 7:42 PM, Jakes John <jakesjohn12...@gmail.com>
> wrote:
>
>> Hi All,
>>           I would like to know if there are useful performance counters
>> in ceph which can help to debug the cluster. I have seen hundreds of stat
>> counters in various daemon dumps. Some of them are,
>>
>> 1. commit_latency_ms
>> 2. apply_latency_ms
>> 3. snap_trim_queue_len
>> 4. num_snap_trimming
>>
>> What do these indicate?
>>
>> I have used iostat and atop for cluster statistics, but neither of them
>> indicates the internal ceph status.  Machines might be new, but osds can
>> still be slow.  If some of these counters can help to debug why certain
>> osds are bad (or can get bad later), it would be great.  For example,
>> counters like total processed requests, pending requests in queue, avg
>> time taken to process a request, etc.?
>>
>>
>> Are there any docs for all performance counters which I can read? I
>> couldn't find anything in the ceph docs.
>>
>> Thanks
>>
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>