RE: Central monitoring of Cassandra cluster

Gregory Szorc Thu, 24 Mar 2011 12:13:21 -0700

We have Collectd (http://www.collectd.org/) monitoring Cassandra via its 
Java/JMX plugin. Collectd feeds data to a central Graphite/Carbon 
(http://graphite.wikidot.com/start) instance via 
https://github.com/indygreg/collectd-carbon . With Graphite, you can use the 
web UI (or HTTP API) to build and save graph definitions that sum or display 
related values across the whole cluster. You can also use Graphite's HTTP API 
to export raw data, which your monitoring infrastructure could then poll for 
alerting.
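
As a rough illustration of the polling idea above, here is a minimal Python sketch that asks Graphite's render API for JSON and flags series whose latest value exceeds a threshold. The host name, the metric path in TARGET, and the helper names are assumptions made up for this example; the real metric paths depend on how collectd-carbon names things in your setup.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical values: adjust to your Graphite host and metric naming.
GRAPHITE_URL = "http://graphite.example.com"
TARGET = "sumSeries(collectd.*.cassandra_row_read_stage.pending)"


def build_render_url(base_url, target, window="-5min"):
    """Build a render API URL that returns JSON for one target."""
    query = urllib.parse.urlencode(
        {"target": target, "from": window, "format": "json"}
    )
    return f"{base_url}/render?{query}"


def latest_value(series):
    """Return the newest non-null value of one series, or None."""
    for value, _timestamp in reversed(series["datapoints"]):
        if value is not None:
            return value
    return None


def breaches(payload, threshold):
    """Return {target: value} for series whose latest value exceeds threshold."""
    result = {}
    for series in json.loads(payload):
        value = latest_value(series)
        if value is not None and value > threshold:
            result[series["target"]] = value
    return result


def poll(threshold):
    """Fetch TARGET from Graphite and report threshold breaches."""
    url = build_render_url(GRAPHITE_URL, TARGET)
    with urllib.request.urlopen(url, timeout=10) as response:
        return breaches(response.read().decode("utf-8"), threshold)
```

A cron job or Nagios-style check could call poll() and alert whenever the returned dict is non-empty. Null datapoints are skipped because the most recent interval is often not filled in yet.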


I have a script that parses a storage-conf.xml file into a Collectd config 
snippet, but I haven't published it yet. In lieu of that, here are some 
samples that work with Cassandra 0.6.12:

Add the following to a Collectd types file:

cassandra_pool               active:GAUGE:0:U, pending:GAUGE:0:U, completed:COUNTER:0:U
cassandra_stage              active:GAUGE:0:2147483648, pending:GAUGE:0:2147483648, completed:COUNTER:0:U
cassandra_cf_cache           rcnt_hit_rate:GAUGE:0:1, size:GAUGE:0:2147483648, capacity:GAUGE:0:2147483648, hits:COUNTER:0:U, requests:COUNTER:0:U
cassandra_cf_store           pending_tasks:GAUGE:0:2147483648, min_row_size:GAUGE:0:U, max_row_size:GAUGE:0:U, mean_row_size:GAUGE:0:U, memtbl_col_cnt:GAUGE:0:U, memtbl_data_size:GAUGE:0:U, memtbl_switch_cnt:COUNTER:0:U, read_cnt:COUNTER:0:U, rcnt_rd_latency:GAUGE:0:2147483648, tot_rd_latency:COUNTER:0:U, write_cnt:COUNTER:0:U, rcnt_wr_latency:GAUGE:0:2147483648, tot_wr_latency:COUNTER:0:U, disk_used_total:GAUGE:0:U, disk_used_live:GAUGE:0:U, ss_table_count:GAUGE:0:1000000, bloom_false_pos:COUNTER:0:U, bloom_rcnt_f_ratio:GAUGE:0:1, bloom_false_ratio:GAUGE:0:1
cassandra_compaction_manager pending:GAUGE:0:U, bytes_in_progress:GAUGE:0:U, bytes_compacted:GAUGE:0:U
cassandra_storage_proxy      rcnt_rd_latency:GAUGE:0:2147483648, tot_rd_latency:COUNTER:0:U, rcnt_wr_latency:GAUGE:0:2147483648, tot_wr_latency:COUNTER:0:U, read_operations:COUNTER:0:U, range_operations:COUNTER:0:U, tot_rg_latency:COUNTER:0:U, rcnt_rg_latency:GAUGE:0:2147483648, write_operations:COUNTER:0:U

(The weird names are due to a character length limitation in Collectd, which 
enforces the restrictions of RRD, since it uses that out of the box.)
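
To make the naming limit concrete, here is a small Python sketch that checks the data source names in a types file line. It assumes the limit documented for RRD data source names (1 to 19 characters from a-z, A-Z, 0-9 and underscore); the email above does not state the exact number, and the function names are made up for this example.

```python
import re

# Assumption: RRD data source names are 1-19 chars of [A-Za-z0-9_].
RRD_DS_NAME = re.compile(r"^[A-Za-z0-9_]{1,19}$")


def parse_types_line(line):
    """Split one types file line into (type_name, [data source names])."""
    fields = line.replace(",", " ").split()
    type_name, specs = fields[0], fields[1:]
    # Each spec looks like name:TYPE:min:max; keep only the name.
    return type_name, [spec.split(":")[0] for spec in specs]


def invalid_ds_names(line):
    """Return the data source names on this line that RRD would reject."""
    _type_name, names = parse_types_line(line)
    return [name for name in names if not RRD_DS_NAME.match(name)]
```

Running invalid_ds_names() over each line of the types file before deploying it would catch a name such as recent_bloom_filter_false_ratio, which is why the samples use abbreviations like bloom_rcnt_f_ratio.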

And add the following to your Collectd config file:

<Plugin "java">
    JVMArg "-Djava.class.path=/usr/share/collectd/java/collectd-api.jar:/usr/share/collectd/java/generic-jmx.jar"
    LoadPlugin "org.collectd.java.GenericJMX"
    <Plugin "GenericJMX">

        # This reads a stage MBean.
        <MBean "cassandra-row-read-stage">
            ObjectName "org.apache.cassandra.concurrent:type=ROW-READ-STAGE"
            InstancePrefix "cassandra_row_read_stage"
            <Value>
                Type "cassandra_stage"
                Attribute "ActiveCount"
                Attribute "PendingTasks"
                Attribute "CompletedTasks"
            </Value>
        </MBean>
        # This reads a specific column family MBean.
        <MBean "cassandra-cf-foo">
            ObjectName "org.apache.cassandra.db:columnfamily=Foo,keyspace=KeySpace,type=ColumnFamilyStores"
            InstancePrefix "cassandra_cf_foo"
            <Value>
                Type "cassandra_cf_store"
                Attribute "PendingTasks"
                Attribute "MinRowCompactedSize"
                Attribute "MaxRowCompactedSize"
                Attribute "MeanRowCompactedSize"
                Attribute "MemtableColumnsCount"
                Attribute "MemtableDataSize"
                Attribute "MemtableSwitchCount"
                Attribute "ReadCount"
                Attribute "RecentReadLatencyMicros"
                Attribute "TotalReadLatencyMicros"
                Attribute "WriteCount"
                Attribute "RecentWriteLatencyMicros"
                Attribute "TotalWriteLatencyMicros"
                Attribute "TotalDiskSpaceUsed"
                Attribute "LiveDiskSpaceUsed"
                Attribute "LiveSSTableCount"
                Attribute "BloomFilterFalsePositives"
                Attribute "RecentBloomFilterFalseRatio"
                Attribute "BloomFilterFalseRatio"
            </Value>
        </MBean>
        # This defines what to connect to and what to collect. I /think/ you
        # can define multiple connections to monitor many Cassandra instances
        # from one Collectd instance, but I haven't tried this.
        <Connection>
            Host "cassandra"
            Collect "cassandra-row-read-stage"
            Collect "cassandra-cf-foo"
            ServiceURL "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi"
        </Connection>
    </Plugin>
</Plugin>
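
The script that turns storage-conf.xml into a Collectd snippet is not included in this message. As a stand-in, the Python sketch below shows one way the idea could work; it is not the script referred to above. It assumes the 0.6-style layout where each Keyspace element carries a Name attribute and contains ColumnFamily elements with their own Name attribute, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Attribute order must match the data source order of cassandra_cf_store.
CF_ATTRIBUTES = [
    "PendingTasks", "MinRowCompactedSize", "MaxRowCompactedSize",
    "MeanRowCompactedSize", "MemtableColumnsCount", "MemtableDataSize",
    "MemtableSwitchCount", "ReadCount", "RecentReadLatencyMicros",
    "TotalReadLatencyMicros", "WriteCount", "RecentWriteLatencyMicros",
    "TotalWriteLatencyMicros", "TotalDiskSpaceUsed", "LiveDiskSpaceUsed",
    "LiveSSTableCount", "BloomFilterFalsePositives",
    "RecentBloomFilterFalseRatio", "BloomFilterFalseRatio",
]


def mbean_block(keyspace, column_family):
    """Render one <MBean> block for a column family."""
    short = column_family.lower()
    object_name = (
        f"org.apache.cassandra.db:columnfamily={column_family},"
        f"keyspace={keyspace},type=ColumnFamilyStores"
    )
    lines = [
        f'<MBean "cassandra-cf-{short}">',
        f'    ObjectName "{object_name}"',
        f'    InstancePrefix "cassandra_cf_{short}"',
        "    <Value>",
        '        Type "cassandra_cf_store"',
    ]
    lines += [f'        Attribute "{attr}"' for attr in CF_ATTRIBUTES]
    lines += ["    </Value>", "</MBean>"]
    return "\n".join(lines)


def collectd_snippet(storage_conf_xml):
    """Emit one <MBean> block per column family found in the XML text."""
    root = ET.fromstring(storage_conf_xml)
    blocks = []
    for keyspace in root.iter("Keyspace"):
        for cf in keyspace.iter("ColumnFamily"):
            blocks.append(mbean_block(keyspace.get("Name"), cf.get("Name")))
    return "\n".join(blocks)
```

The output still needs a matching Collect line per MBean inside the Connection block. If two keyspaces share a column family name, the MBean names would collide, so a real script would probably fold the keyspace into the name as well.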

Greg

> -----Original Message-----
> From: mcasandra [mailto:[email protected]]
> Sent: Thursday, March 24, 2011 11:45 AM
> To: [email protected]
> Subject: Central monitoring of Cassandra cluster
> 
> Can someone share if they have centralized monitoring for all cassandra
> servers. With many nodes it becomes difficult to monitor them individually
> unless we can look at data in one place. I am looking at solutions where this
> can be done. Looking at Cacti currently but not sure how to integrate it with
> JMX.
> 
> --
> View this message in context: http://cassandra-user-incubator-apache-
> org.3065146.n2.nabble.com/Central-monitoring-of-Cassandra-cluster-
> tp6205275p6205275.html
> Sent from the [email protected] mailing list archive at
> Nabble.com.
