We are already gathering the Ceph admin socket stats with the Diamond
plugin and sending them to Graphite, so I guess I just need to look through
them to find what I'm looking for.
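
For anyone who wants to poll the sockets directly instead, here is a rough sketch of the idea, not a tested tool. It assumes that `ceph daemon osd.N dump_blocked_ops` is available on the OSD host and that its JSON output has an `ops` list whose entries carry an `age` in seconds; check both against your release. The function names are made up for illustration.

```python
import json
import subprocess
from concurrent.futures import ThreadPoolExecutor


def dump_blocked_ops(osd_id):
    """Query one OSD's admin socket (has to run on the host that owns the OSD)."""
    out = subprocess.run(
        ["ceph", "daemon", f"osd.{osd_id}", "dump_blocked_ops"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def summarize(osd_id, dump):
    """Reduce one dump to (osd_id, blocked op count, age of oldest op in seconds)."""
    # Assumption: the dump looks like {"ops": [{"age": <seconds>, ...}, ...]}.
    ops = dump.get("ops", [])
    oldest = max((float(op.get("age", 0.0)) for op in ops), default=0.0)
    return osd_id, len(ops), oldest


def rank_osds(dumps):
    """Order OSDs worst-first: longest-blocked op, then most blocked ops."""
    rows = [summarize(osd_id, dump) for osd_id, dump in dumps.items()]
    rows = [row for row in rows if row[1] > 0]  # drop OSDs with nothing blocked
    return sorted(rows, key=lambda row: (row[2], row[1]), reverse=True)


def poll(osd_ids, workers=8):
    """Poll several local OSDs in parallel and return the ranked summary."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        dumps = dict(zip(osd_ids, pool.map(dump_blocked_ops, osd_ids)))
    return rank_osds(dumps)
```

Calling `poll([0, 1, 2])` on an OSD host would return tuples of (OSD id, blocked ops, oldest age) with the most suspicious OSD first, which could then be pushed to Graphite or an alerting system.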
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Fri, Mar 13, 2020 at 4:48 PM Anthony D'Atri <anthony.da...@gmail.com> wrote:

> Yeah the removal of that was annoying for sure.  ISTR that one can gather
> the information from the OSDs’ admin sockets.
>
> Envision a Prometheus exporter that polls the admin sockets (in parallel)
> and Grafana panes that graph slow requests by OSD and by node.
>
>
> > On Mar 13, 2020, at 4:14 PM, Robert LeBlanc <rob...@leblancnet.us> wrote:
> >
> > For Jewel I wrote a script to take the output of `ceph health detail
> > --format=json` and send alerts to our system that ordered the OSDs based on
> > how long the ops were blocked and which OSDs had the most ops blocked. This
> > was really helpful to quickly identify which OSD out of a list of 100 would
> > be the most probable one having issues. Since upgrading to Luminous, I
> > don't get that and I'm not sure where that info went. Do I need to query
> > the manager now?
> >
> > This is the regex I was using to extract the pertinent information:
> >
> > '^(\d+) ops are blocked > (\d+\.+\d+) sec on osd\.(\d+)$'
> >
> > Thanks,
> > Robert LeBlanc
> > ----------------
> > Robert LeBlanc
> > PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
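
As an illustration of the Jewel-era approach described in the quoted message, the regex can be applied to the health detail lines and the matches ranked by how long the ops were blocked and how many were blocked. This is only a sketch under assumptions: the pattern is copied verbatim from the message, while the function name and the sample lines in the usage note are made up, and extracting the lines from the JSON output is left out.

```python
import re
from collections import defaultdict

# Pattern copied verbatim from the message above.
BLOCKED_RE = re.compile(r'^(\d+) ops are blocked > (\d+\.+\d+) sec on osd\.(\d+)$')


def rank_blocked(lines):
    """Return (osd, longest blocked threshold in sec, total blocked ops), worst first."""
    per_osd = defaultdict(lambda: [0.0, 0])  # osd -> [longest threshold, op count]
    for line in lines:
        match = BLOCKED_RE.match(line.strip())
        if not match:
            continue  # ignore lines that are not per-OSD blocked-op reports
        count = int(match.group(1))
        secs = float(match.group(2))
        osd = int(match.group(3))
        entry = per_osd[osd]
        entry[0] = max(entry[0], secs)
        entry[1] += count
    rows = [(osd, secs, ops) for osd, (secs, ops) in per_osd.items()]
    return sorted(rows, key=lambda row: (row[1], row[2]), reverse=True)
```

Fed hypothetical lines such as `2 ops are blocked > 32.768 sec on osd.12` and `1 ops are blocked > 65.536 sec on osd.12`, the function reports osd.12 once, with the longer threshold and the summed op count.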