Hi Paul,

Quoting Paul Emmerich (paul.emmer...@croit.io):
> https://static.croit.io/ceph-training-examples/ceph-training-example-admin-socket.pdf

Thanks for the link. So, what tool do you use to gather the metrics? We
are using the telegraf module of the Ceph manager. However, this module
only provides "sum" and neither "avgtime" nor a count to derive it from,
so I can't do the calculations. The influx and zabbix mgr modules also
only provide "sum". The only metrics module that *does* send enough to
work out "avgtime" is the prometheus module, which exports both a sum
and a count:

ceph_mds_reply_latency_sum
ceph_mds_reply_latency_count
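
For reference, the calculation I have in mind is roughly the following.
This is only a sketch: it assumes both series are monotonically
increasing totals (latency in seconds, number of replies), and the
function name and the sample numbers are made up for illustration.

```python
def interval_avg_latency(prev, curr):
    """Average latency over the interval between two samples.

    Each sample is a (latency_sum, reply_count) tuple, e.g. taken from
    ceph_mds_reply_latency_sum and ceph_mds_reply_latency_count at two
    scrape times. Returns None when no replies were counted in between
    (or when the counters were reset, e.g. after a daemon restart).
    """
    prev_sum, prev_count = prev
    curr_sum, curr_count = curr
    replies = curr_count - prev_count
    if replies <= 0:
        return None
    return (curr_sum - prev_sum) / replies


# Hypothetical samples: 30 replies took 6 seconds in total.
avg = interval_avg_latency((10.0, 100), (16.0, 130))
```

With only "sum" available there is no denominator, which is why the
other modules do not let me do this.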

All modules use "self.get_all_perf_counters()" though:

~/git/ceph/src/pybind/mgr/ > grep -Ri get_all_perf_counters *
dashboard/controllers/perf_counters.py:        return mgr.get_all_perf_counters()
diskprediction_cloud/agent/metrics/ceph_mon_osd.py:        perf_data = obj_api.module.get_all_perf_counters(services=('mon', 'osd'))
influx/module.py:        for daemon, counters in six.iteritems(self.get_all_perf_counters()):
mgr_module.py:    def get_all_perf_counters(self, prio_limit=PRIO_USEFUL,
prometheus/module.py:        for daemon, counters in self.get_all_perf_counters().items():
restful/api/perf.py:        counters = context.instance.get_all_perf_counters()
telegraf/module.py:        for daemon, counters in six.iteritems(self.get_all_perf_counters())
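
To make the question concrete, here is a guess at what a module would
have to do to export both values. This is a sketch and not the code of
any of the modules above: it assumes the call returns a dict of
daemon -> counter path -> info, where info carries "type", "value" and,
for long-running averages, "count". The flag value, the helper name and
the sample data are assumptions for illustration only.

```python
# Assumed bit flag marking "long-running average" counters (sum + count).
LONGRUNAVG = 4


def flatten(all_counters):
    """Turn the nested counter dict into (daemon, metric, value) tuples.

    Long-running averages are emitted as two series, <name>_sum and
    <name>_count; everything else is emitted as a single series.
    """
    points = []
    for daemon, counters in sorted(all_counters.items()):
        for path, info in sorted(counters.items()):
            name = path.replace(".", "_")
            if info["type"] & LONGRUNAVG:
                points.append((daemon, name + "_sum", info["value"]))
                points.append((daemon, name + "_count", info["count"]))
            else:
                points.append((daemon, name, info["value"]))
    return points


# Hypothetical data standing in for the result of the real call.
sample = {
    "mds.a": {
        "mds.reply_latency": {"type": 5, "value": 12.5, "count": 50},
        "mds.request": {"type": 10, "value": 1234},
    },
}
points = flatten(sample)
```

If a module only forwards "value" in a loop like this, the count would
be dropped on the way out, but I have not verified that this is what
actually happens.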

Besides the *ceph* telegraf module we also use the ceph plugin for
telegraf ... but that plugin does not (yet?) provide mds metrics.
Ideally we would *only* use the ceph mgr telegraf module to collect *all
the things*.

I'm not sure what difference in the Python code between the modules
could explain this.

Gr. Stefan

-- 
| BIT BV  https://www.bit.nl/        Kamer van Koophandel 09090351
| GPG: 0xD14839C6                   +31 318 648 688 / i...@bit.nl
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
