Re: Cassandra read throughput with little/no caching.

Tyler Hobbs Thu, 03 Jan 2013 08:42:06 -0800

>
> Your description above was much better :-) I'm more interested in docs for
> the raw metrics provided in JMX.



I don't think there are any good docs for what is exposed directly through
JMX.  Most of the OpsCenter metrics map closely to one exposed JMX item, so
that's a start.  Other than that, your best bet at the moment for accurate
descriptions is to read the source.  Methods and attributes exposed through
JMX follow a particular format (MBean classes, naming conventions) that
makes them pretty easy to find.
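
As a rough illustration of that convention: in standard JMX, an interface
named SomethingMBean exposes each getX()/isX() method as an attribute X, and
anything else as an operation. A minimal sketch that pulls those names out of
Java source text is below. The ExampleProxyMBean interface in it is a made-up
stand-in, not copied from the Cassandra source, and describe_mbean is just a
helper name I picked for this example.

```python
import re

# Illustrative stand-in for an MBean interface. This is NOT copied from the
# Cassandra source; point the helper at the real *MBean.java files instead.
SAMPLE_SOURCE = """
public interface ExampleProxyMBean
{
    public long getReadOperations();
    public long getTotalReadLatencyMicros();
    public double getRecentReadLatencyMicros();
    public boolean isHintedHandoffEnabled();
    public void setHintedHandoffEnabled(boolean b);
    public void reloadTriggers();
}
"""

INTERFACE_RE = re.compile(r"interface\s+(\w+MBean)\b")
METHOD_RE = re.compile(r"\b(\w+)\s*\(([^)]*)\)\s*;")


def describe_mbean(source):
    """Return (interface_name, attributes, operations) for one MBean interface.

    Applies the standard MBean naming rules: getX()/isX() with no parameters
    and setX(value) with one parameter all describe attribute X; every other
    method is treated as an operation.
    """
    match = INTERFACE_RE.search(source)
    if match is None:
        raise ValueError("no *MBean interface found")
    attributes, operations = set(), []
    for name, params in METHOD_RE.findall(source):
        params = params.strip()
        if name.startswith("get") and len(name) > 3 and not params:
            attributes.add(name[3:])
        elif name.startswith("is") and len(name) > 2 and not params:
            attributes.add(name[2:])
        elif name.startswith("set") and len(name) > 3 and params and "," not in params:
            attributes.add(name[3:])
        else:
            operations.append(name)
    return match.group(1), sorted(attributes), operations


if __name__ == "__main__":
    iface, attrs, ops = describe_mbean(SAMPLE_SOURCE)
    print(iface)
    print("attributes:", attrs)
    print("operations:", ops)
```

The attribute names it prints are the ones you would then look for in a JMX
browser such as jconsole.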

> That would be a great feature, but it's quite difficult taking
> high-resolution data capture without disturbing the system you're trying to
> measure.
>

True, but fortunately Cassandra doesn't tend to be CPU bound, so simply
sampling JMX data doesn't tend to impact normal performance metrics.
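
For what it's worth, here is a minimal sketch of the kind of low-overhead
sampling I have in mind: poll a pair of cumulative counters (total latency and
operation count) at a short interval and work out per-interval figures from
the deltas, rather than averaging over the life of the process. The
read_counters callable is hypothetical; it stands for whatever you use to
fetch the two values over JMX, and the function names are my own.

```python
import time


def poll(read_counters, interval_s, count, clock=time.time, sleep=time.sleep):
    """Collect `count` samples of (timestamp_s, total_latency_micros, operations).

    `read_counters` is a hypothetical callable returning the two cumulative
    counters; `clock` and `sleep` are injectable so the loop can be tested.
    """
    samples = []
    for i in range(count):
        total_latency_micros, operations = read_counters()
        samples.append((clock(), total_latency_micros, operations))
        if i < count - 1:
            sleep(interval_s)
    return samples


def interval_latencies(samples):
    """Turn cumulative samples into per-interval figures.

    Returns a list of (timestamp_s, ops_per_second, mean_latency_micros),
    where mean_latency_micros is None for an interval with no operations.
    Intervals where a counter went backwards (for example after a node
    restart) or the clock did not advance are skipped.
    """
    results = []
    for (t0, lat0, ops0), (t1, lat1, ops1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        d_lat = lat1 - lat0
        d_ops = ops1 - ops0
        if elapsed <= 0 or d_lat < 0 or d_ops < 0:
            continue
        mean = d_lat / d_ops if d_ops else None
        results.append((t1, d_ops / elapsed, mean))
    return results
```

With a 10 second interval this gives one unaveraged data point per interval,
and the only cost on the Cassandra side is one pair of attribute reads each
time.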

> Perhaps worth taking the data-capture points off-list?


Sure, I'd love to hear your ideas.


On Wed, Jan 2, 2013 at 11:41 AM, James Masson <james.mas...@opigram.com> wrote:

>
>
> On 02/01/13 16:18, Tyler Hobbs wrote:
>
>> On Wed, Jan 2, 2013 at 5:28 AM, James Masson <james.mas...@opigram.com> wrote:
>>
>> 1) Hector sends a request to some node in the cluster, which will act as
>> the coordinator.
>> 2) The coordinator then sends the actual read requests out to each of
>> the (RF) replicas.
>> 3a) The coordinator waits for responses from the replicas; how many it
>> waits for depends on the consistency level.
>> 3b) The replicas perform actual cache/memtable/sstable reads and respond
>> to the coordinator when complete
>> 4) Once the required number of replicas have responded, the coordinator
>> replies to the client (Hector).
>>
>> The Read Request Latency metric is measuring the time taken in steps 2
>> through 4.  The CF Local Read Latency metric is only capturing the time
>> taken in step 3b.
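
To make the difference between the two metrics concrete, here is a toy model
of steps 2 through 4 above. It is a deliberate simplification: it ignores read
repair, digest reads and the coordinator's own processing time, and all of the
function names and numbers are made up for illustration.

```python
def required_responses(consistency_level, replication_factor):
    """Number of replica responses the coordinator waits for (step 3a)."""
    levels = {
        "ONE": 1,
        "TWO": 2,
        "THREE": 3,
        "QUORUM": replication_factor // 2 + 1,
        "ALL": replication_factor,
    }
    needed = levels[consistency_level]
    if needed > replication_factor:
        raise ValueError("consistency level needs more replicas than RF")
    return needed


def read_request_latency(replicas, consistency_level):
    """Coordinator-side latency for one read, in the same unit as the inputs.

    `replicas` holds one (network_round_trip, local_read) pair per replica.
    The local_read part corresponds to step 3b; the coordinator can reply
    (step 4) as soon as the required number of replicas have responded.
    """
    needed = required_responses(consistency_level, len(replicas))
    response_times = sorted(rtt + local for rtt, local in replicas)
    return response_times[needed - 1]


if __name__ == "__main__":
    # Made-up timings in milliseconds for RF=3: (network round trip, local read)
    replicas = [(1.0, 2.0), (1.5, 8.0), (0.5, 3.0)]
    for level in ("ONE", "QUORUM", "ALL"):
        print(level, read_request_latency(replicas, level))
```

In this model one slow replica (the 8.0 ms local read) only shows up in the
request latency at ALL, while it always shows up in that node's local read
latency.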
>>
>>
>>
> Great, that's exactly the level of detail I'm looking for.
>
>
>
>>
>>     Is there anywhere I can find concrete definitions of what the stats
>>     in OpsCenter, and raw Cassandra via JMX mean? The docs I've found
>>     seem quite ambiguous.
>>
>>
>> This has pretty good writeups of each:
>> http://www.datastax.com/docs/opscenter/online_help/performance/index#opscenter-performance-metrics
>>
>
> Your description above was much better :-) I'm more interested in docs for
> the raw metrics provided in JMX.
>
>
>
>>
>>     I still think that the data resolution that OpsCenter gives makes it
>>     more suitable for trending/alerting rather than chasing down tricky
>>     performance issues. This sort of investigation work is what I do for
>>     a living, I typically use intervals of 10 seconds or lower, and
>>     don't average my data. Although, storing your data inside the
>>     database you're measuring does restrict your options a little :-)
>>
>>
>> True, there's a limit to what you can detect with 60 second resolution.
>> We've considered being able to report metrics at a finer resolution
>> without durably storing them anywhere, which would be useful for when
>> you're actively watching the cluster.
>>
>
> That would be a great feature, but it's quite difficult taking
> high-resolution data capture without disturbing the system you're trying to
> measure.
>
> Perhaps worth taking the data-capture points off-list?
>
> James M
>



-- 
Tyler Hobbs
DataStax <http://datastax.com/>
