This is fantastic stuff. Thanks!
On Sun, Apr 13, 2014 at 10:17 AM, Dan Van Der Ster <daniel.vanders...@cern.ch> wrote:

> For our cluster we monitor write latency by running a short (10s) rados
> bench with one thread writing 64kB objects, every 5 minutes or so. rados
> bench tells you the min, max, and average of those writes -- we plot them
> all. An example is attached.
>
> The latency and other metrics that we plot (including IOPS) are here in
> this sensor:
> https://github.com/cernceph/ceph-scripts/blob/master/cern-sls/ceph-sls.py
> Unfortunately it is not directly usable by others since it has been
> written for our local monitoring system.
>
> Cheers, Dan
>
> ------------------------------
> *From:* ceph-users-boun...@lists.ceph.com [ceph-users-boun...@lists.ceph.com]
> on behalf of Jason Villalta [ja...@rubixnet.com]
> *Sent:* 12 April 2014 16:41
> *To:* Greg Poirier
> *Cc:* ceph-users@lists.ceph.com
> *Subject:* Re: [ceph-users] Useful visualizations / metrics
>
> I know Ceph throws some warnings if there is high write latency, but I
> would be most interested in the delay for IO requests, since that links
> directly to IOPS. If IOPS start to drop because the disks are overwhelmed,
> then latency for requests would be increasing. That would tell me that I
> need to add more OSDs/nodes. I am not sure there is a specific metric in
> Ceph for this, but it would be awesome if there was.
>
> On Sat, Apr 12, 2014 at 10:37 AM, Greg Poirier <greg.poir...@opower.com> wrote:
>
>> Curious as to how you define cluster latency.
>>
>> On Sat, Apr 12, 2014 at 7:21 AM, Jason Villalta <ja...@rubixnet.com> wrote:
>>
>>> Hi, I have not done anything with metrics yet, but the only ones I
>>> personally would be interested in are total capacity utilization and
>>> cluster latency.
>>>
>>> Just my 2 cents.
>>>
>>> On Sat, Apr 12, 2014 at 10:02 AM, Greg Poirier <greg.poir...@opower.com> wrote:
>>>
>>>> I'm in the process of building a dashboard for our Ceph nodes. I was
>>>> wondering if anyone out there had instrumented their OSD / MON clusters
>>>> and found particularly useful visualizations.
>>>>
>>>> At first, I was trying to do ridiculous things (like graphing % used
>>>> for every disk in every OSD host), but I realized quickly that that is
>>>> simply too many metrics and far too visually dense to be useful. I am
>>>> attempting to put together a few simpler, denser visualizations:
>>>> overall cluster utilization, aggregate CPU and memory utilization per
>>>> OSD host, etc.
>>>>
>>>> Just looking for some suggestions. Thanks!
>>>
>>> --
>>> *Jason Villalta*
>>> Co-founder
>>> 800.799.4407x1230 | www.RubixTechnology.com
>
> --
> *Jason Villalta*
> Co-founder
> 800.799.4407x1230 | www.RubixTechnology.com
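P.S. For anyone else who wants to wire up a similar probe without adapting the CERN sensor, a rough, untested sketch of what Dan describes is below. It assumes a pool named "bench" set aside for benchmarking, and that your rados bench output prints min/max/average latency lines in roughly the usual form; the exact labels vary between Ceph releases, so adjust the regex to match yours.

#!/usr/bin/env python
# Rough sketch of the latency probe Dan describes: run a short, single-threaded
# rados bench of 64 kB writes and pull out the min/max/average latency numbers.
# Untested; assumes a pool named "bench" exists and that rados bench prints
# lines such as "Average Latency: 0.0123" -- labels differ between releases.
import re
import subprocess

POOL = "bench"       # assumed pool dedicated to benchmarking
SECONDS = 10         # length of the probe
BLOCK_SIZE = 65536   # 64 kB objects
THREADS = 1          # single writer, as in Dan's setup

def probe_write_latency():
    out = subprocess.check_output(
        ["rados", "bench", "-p", POOL, str(SECONDS), "write",
         "-t", str(THREADS), "-b", str(BLOCK_SIZE)],
        universal_newlines=True)
    latencies = {}
    for line in out.splitlines():
        # Match lines like "Average Latency: 0.0123" or "Max latency: 0.45"
        m = re.match(r"\s*(Average|Max|Min)\s+[Ll]atency\S*:\s*([\d.]+)", line)
        if m:
            latencies[m.group(1).lower()] = float(m.group(2))
    return latencies

if __name__ == "__main__":
    print(probe_write_latency())

Run it from cron every ~5 minutes and push the three numbers into whatever graphing system you already have.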
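Similarly, for the single "overall cluster utilization" number Greg mentions, something as small as the sketch below may be enough to feed one dashboard graph. Again untested: the field names under "stats" in the ceph df JSON differ between releases, so check them against your version.

#!/usr/bin/env python
# Sketch of a single "overall cluster utilization" number for a dashboard,
# taken from "ceph df --format json". The "stats" field names differ between
# Ceph releases (total_space/total_used in kB vs total_bytes/total_used_bytes),
# so this tries both; treat it as a starting point, not a drop-in sensor.
import json
import subprocess

def cluster_used_percent():
    raw = subprocess.check_output(["ceph", "df", "--format", "json"],
                                  universal_newlines=True)
    stats = json.loads(raw)["stats"]
    if "total_bytes" in stats:            # newer field names (bytes)
        total, used = stats["total_bytes"], stats["total_used_bytes"]
    else:                                 # older field names (kB)
        total, used = stats["total_space"], stats["total_used"]
    return 100.0 * used / total

if __name__ == "__main__":
    print("cluster utilization: %.1f%%" % cluster_used_percent())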
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com