Hi,

In our use case, we maintain minute-wise roll ups for different metrics.
These are stored in a counter column family where the row key is a
composite containing the timestamp rounded to the last minute and an
integer between 0-9 (This integer is calculated as the MD5 hash of the
metric mod 10). The column names are the metrics we wish to track.
Typically, each row has about 100,000 counters.

We tested two scenarios. The first one is as mentioned above. In this case
we got a per write latency of about 80 micro-seconds to 100 micro-seconds.

In the other scenario, we calculated the integer in the row key as mod 100.
In this case we observed a per write latency of 50 micro-seconds to 70
micro-seconds.

I wish to understand why updates to counters were faster as they got spread
across multiple rows?

Cluster summary : 4 nodes running Cassandra 1.0.5. Each with 8 cores, 32G
RAM, 10G Cassandra heap. We are using replication factor of 2.


-- 
Thanks!
Amit Chavan

Reply via email to