We have Collectd (http://www.collectd.org/) monitoring Cassandra via its
Java/JMX plugin. Collectd feeds data to a central Graphite/Carbon
(http://graphite.wikidot.com/start) instance via
https://github.com/indygreg/collectd-carbon . With Graphite, you can use the
web UI (or HTTP API) to build and save graph definitions that sum, display, or
otherwise combine related values across the whole cluster. You can also use
Graphite's HTTP API to export raw data, which your monitoring infrastructure
could then poll for alerting.
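To illustrate the polling idea, here is a minimal Python sketch, not part of the setup described here. It assumes Graphite's render endpoint with format=json, which returns a list of series, each with a "target" name and "datapoints" given as [value, timestamp] pairs. The base URL, function names, and threshold logic are all hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical Graphite base URL; replace with your own instance.
GRAPHITE_URL = "http://graphite.example.com"


def build_render_url(target, minutes=5):
    """Build a render API URL returning JSON for the last `minutes` minutes."""
    query = urlencode({"target": target, "from": f"-{minutes}min", "format": "json"})
    return f"{GRAPHITE_URL}/render?{query}"


def latest_values(payload):
    """Map each series name to its most recent non-null value."""
    result = {}
    for series in payload:
        values = [value for value, _timestamp in series["datapoints"] if value is not None]
        if values:
            result[series["target"]] = values[-1]
    return result


def breaches(payload, threshold):
    """Return the series whose latest value exceeds the threshold."""
    return {name: value for name, value in latest_values(payload).items() if value > threshold}


def poll(target, threshold):
    """Fetch the series from Graphite and return those over the threshold."""
    with urlopen(build_render_url(target)) as response:
        return breaches(json.load(response), threshold)
```

An alerting check could call poll() with a wildcard target covering the pending-task series of every node and raise an alert whenever the returned dict is non-empty. The exact metric path depends on how collectd-carbon names things in your setup.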
I have a script that parses a storage-conf.xml file into a Collectd config
snippet, but I haven't published it anywhere yet. In lieu of that, here are
some samples that work with Cassandra 0.6.12.

Add the following to a Collectd types file:

cassandra_pool active:GAUGE:0:U, pending:GAUGE:0:U, completed:COUNTER:0:U
cassandra_stage active:GAUGE:0:2147483648, pending:GAUGE:0:2147483648, completed:COUNTER:0:U
cassandra_cf_cache rcnt_hit_rate:GAUGE:0:1, size:GAUGE:0:2147483648, capacity:GAUGE:0:2147483648, hits:COUNTER:0:U, requests:COUNTER:0:U
cassandra_cf_store pending_tasks:GAUGE:0:2147483648, min_row_size:GAUGE:0:U, max_row_size:GAUGE:0:U, mean_row_size:GAUGE:0:U, memtbl_col_cnt:GAUGE:0:U, memtbl_data_size:GAUGE:0:U, memtbl_switch_cnt:COUNTER:0:U, read_cnt:COUNTER:0:U, rcnt_rd_latency:GAUGE:0:2147483648, tot_rd_latency:COUNTER:0:U, write_cnt:COUNTER:0:U, rcnt_wr_latency:GAUGE:0:2147483648, tot_wr_latency:COUNTER:0:U, disk_used_total:GAUGE:0:U, disk_used_live:GAUGE:0:U, ss_table_count:GAUGE:0:1000000, bloom_false_pos:COUNTER:0:U, bloom_rcnt_f_ratio:GAUGE:0:1, bloom_false_ratio:GAUGE:0:1
cassandra_compaction_manager pending:GAUGE:0:U, bytes_in_progress:GAUGE:0:U, bytes_compacted:GAUGE:0:U
cassandra_storage_proxy rcnt_rd_latency:GAUGE:0:2147483648, tot_rd_latency:COUNTER:0:U, rcnt_wr_latency:GAUGE:0:2147483648, tot_wr_latency:COUNTER:0:U, read_operations:COUNTER:0:U, range_operations:COUNTER:0:U, tot_rg_latency:COUNTER:0:U, rcnt_rg_latency:GAUGE:0:2147483648, write_operations:COUNTER:0:U

(The abbreviated names work around a length limit on data source names:
Collectd writes RRD files out of the box, so it enforces RRD's restrictions.)
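If you add your own types, a quick sanity check before restarting Collectd can save some head-scratching. The Python sketch below is only an illustration, not a Collectd tool: it parses one types file line (a type name followed by name:TYPE:min:max specs, each optionally followed by a comma) and flags names that would not be valid RRD data source names. The 19-character limit and allowed character set are my assumption about RRD's rules, and the function names are hypothetical.

```python
import re

# Assumption: RRD data source names are 1-19 characters from [A-Za-z0-9_].
RRD_DS_NAME = re.compile(r"^[A-Za-z0-9_]{1,19}$")
DS_TYPES = {"GAUGE", "COUNTER", "DERIVE", "ABSOLUTE"}


def parse_types_line(line):
    """Split one types file line into (type_name, [(name, type, min, max), ...])."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("expected a type name followed by at least one data source")
    type_name, specs = fields[0], fields[1:]
    sources = []
    for spec in specs:
        # A trailing comma after each spec is optional, so drop it if present.
        parts = spec.rstrip(",").split(":")
        if len(parts) != 4:
            raise ValueError(f"malformed data source: {spec!r}")
        ds_name, ds_type, ds_min, ds_max = parts
        if ds_type not in DS_TYPES:
            raise ValueError(f"unknown data source type: {ds_type!r}")
        sources.append((ds_name, ds_type, ds_min, ds_max))
    return type_name, sources


def invalid_rrd_names(sources):
    """Return the data source names that fail the assumed RRD naming rule."""
    return [name for name, *_rest in sources if not RRD_DS_NAME.match(name)]
```

Running each line of the types file through parse_types_line() and then invalid_rrd_names() should return an empty list for every type defined above.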
And add the following to your Collectd config file:

<Plugin "java">
  JVMArg "-Djava.class.path=/usr/share/collectd/java/collectd-api.jar:/usr/share/collectd/java/generic-jmx.jar"
  LoadPlugin "org.collectd.java.GenericJMX"

  <Plugin "GenericJMX">
    # This reads a stage MBean.
    <MBean "cassandra-row-read-stage">
      ObjectName "org.apache.cassandra.concurrent:type=ROW-READ-STAGE"
      InstancePrefix "cassandra_row_read_stage"
      <Value>
        Type "cassandra_stage"
        Attribute "ActiveCount"
        Attribute "PendingTasks"
        Attribute "CompletedTasks"
      </Value>
    </MBean>

    # This reads a specific column family MBean.
    <MBean "cassandra-cf-foo">
      ObjectName "org.apache.cassandra.db:columnfamily=Foo,keyspace=KeySpace,type=ColumnFamilyStores"
      InstancePrefix "cassandra_cf_foo"
      <Value>
        Type "cassandra_cf_store"
        Attribute "PendingTasks"
        Attribute "MinRowCompactedSize"
        Attribute "MaxRowCompactedSize"
        Attribute "MeanRowCompactedSize"
        Attribute "MemtableColumnsCount"
        Attribute "MemtableDataSize"
        Attribute "MemtableSwitchCount"
        Attribute "ReadCount"
        Attribute "RecentReadLatencyMicros"
        Attribute "TotalReadLatencyMicros"
        Attribute "WriteCount"
        Attribute "RecentWriteLatencyMicros"
        Attribute "TotalWriteLatencyMicros"
        Attribute "TotalDiskSpaceUsed"
        Attribute "LiveDiskSpaceUsed"
        Attribute "LiveSSTableCount"
        Attribute "BloomFilterFalsePositives"
        Attribute "RecentBloomFilterFalseRatio"
        Attribute "BloomFilterFalseRatio"
      </Value>
    </MBean>

    # This defines what to connect to and what to collect. I /think/ you
    # can define multiple connections to monitor many Cassandra instances
    # from one Collectd instance, but I haven't tried this.
    <Connection>
      Host "cassandra"
      Collect "cassandra-row-read-stage"
      Collect "cassandra-cf-foo"
      ServiceURL "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi"
    </Connection>
  </Plugin>
</Plugin>

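The storage-conf.xml script mentioned earlier isn't published, but the idea can be sketched in a few lines of Python. This is not the original script, just a minimal illustration: it assumes storage-conf.xml nests ColumnFamily elements (with a Name attribute) inside Keyspace elements (also with a Name attribute), and it emits one MBean block per column family in the same shape as the hand-written one above. The function names and the MBean naming scheme are hypothetical.

```python
import xml.etree.ElementTree as ET

# Attribute list copied from the column family MBean block above; the order
# must match the data sources of the cassandra_cf_store type.
CF_ATTRIBUTES = [
    "PendingTasks", "MinRowCompactedSize", "MaxRowCompactedSize",
    "MeanRowCompactedSize", "MemtableColumnsCount", "MemtableDataSize",
    "MemtableSwitchCount", "ReadCount", "RecentReadLatencyMicros",
    "TotalReadLatencyMicros", "WriteCount", "RecentWriteLatencyMicros",
    "TotalWriteLatencyMicros", "TotalDiskSpaceUsed", "LiveDiskSpaceUsed",
    "LiveSSTableCount", "BloomFilterFalsePositives",
    "RecentBloomFilterFalseRatio", "BloomFilterFalseRatio",
]


def column_families(storage_conf_xml):
    """Yield (keyspace, column_family) pairs from storage-conf.xml text.

    Assumes <Keyspace Name="..."><ColumnFamily Name="..."/></Keyspace>.
    """
    root = ET.fromstring(storage_conf_xml)
    for keyspace in root.iter("Keyspace"):
        for cf in keyspace.iter("ColumnFamily"):
            yield keyspace.get("Name"), cf.get("Name")


def mbean_block(keyspace, cf):
    """Render one <MBean> block for a column family."""
    name = f"cassandra-cf-{keyspace}-{cf}".lower()
    prefix = name.replace("-", "_")
    object_name = (
        f"org.apache.cassandra.db:columnfamily={cf},"
        f"keyspace={keyspace},type=ColumnFamilyStores"
    )
    lines = [
        f'<MBean "{name}">',
        f'  ObjectName "{object_name}"',
        f'  InstancePrefix "{prefix}"',
        "  <Value>",
        '    Type "cassandra_cf_store"',
    ]
    lines += [f'    Attribute "{attr}"' for attr in CF_ATTRIBUTES]
    lines += ["  </Value>", "</MBean>"]
    return "\n".join(lines)


def config_snippet(storage_conf_xml):
    """Render MBean blocks for every column family found in the XML."""
    return "\n\n".join(mbean_block(ks, cf) for ks, cf in column_families(storage_conf_xml))
```

The output would be pasted inside the GenericJMX plugin block, with a matching Collect line added to the Connection block for each generated MBean name.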
Greg
> -----Original Message-----
> From: mcasandra [mailto:[email protected]]
> Sent: Thursday, March 24, 2011 11:45 AM
> To: [email protected]
> Subject: Central monitoring of Cassandra cluster
>
> Can someone share if they have centralized monitoring for all cassandra
> servers. With many nodes it becomes difficult to monitor them individually
> unless we can look at data in one place. I am looking at solutions where this
> can be done. Looking at Cacti currently but not sure how to integrate it with
> JMX.