Re: read request distribution

Wei Zhu Tue, 13 Nov 2012 12:43:34 -0800

I am new to Cassandra, and 1.1.6 is the only version I have tested, so I am not 
sure about the old behavior. In 1.1.6, my observation is that for a brand new 
cluster (with no CF created), nodetool ring shows an Ownership column whose 
value is 100/<nodes>. As soon as one CF is created, the column changes to 
Effective Ownership and the formula seems to be 100*<replication 
factor>/<nodes>, as Kirk mentioned. Theoretically, different keyspaces can have 
different replication factors, and I am not sure how Effective Ownership is 
calculated in that case. Does anyone know?


Thanks.
-Wei


________________________________
 From: Kirk True <k...@mustardgrain.com>
To: user@cassandra.apache.org 
Sent: Monday, November 12, 2012 4:24 PM
Subject: Re: read request distribution
 

 
Somewhat recently the Ownership column was changed to Effective Ownership. 

 
Previously the formula was essentially 100/<nodes>. Now it's 100*<replication 
factor>/<nodes>. So in previous releases of Cassandra it would be 100/12 = 
8.33%, whereas now it would be closer to 25% (8.33*3, assuming a replication 
factor of three).
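
To make the arithmetic concrete, here is a minimal sketch of the two formulas. 
It assumes tokens are evenly balanced across the nodes; the function names are 
made up for illustration and are not part of Cassandra or nodetool, and the 
cap at 100% is an assumption for the case where the replication factor is at 
least the node count.

```python
def ownership_percent(nodes: int) -> float:
    """Old "Ownership" column: each node's share of the token range."""
    return 100.0 / nodes


def effective_ownership_percent(nodes: int, replication_factor: int) -> float:
    """ "Effective Ownership": share of the data a node holds, replicas included.

    Assumption: evenly balanced tokens. The min() cap is also an assumption,
    since a node cannot hold more than all of the data.
    """
    return min(100.0, 100.0 * replication_factor / nodes)


if __name__ == "__main__":
    print(round(ownership_percent(12), 2))        # 12 nodes, old formula
    print(effective_ownership_percent(12, 3))     # 12 nodes, RF=3
    print(effective_ownership_percent(3, 3))      # 3 nodes, RF=3
```

With these assumptions, the 12 node, RF=3 cluster comes out at 25% per node, 
and the 3 node, RF=3 cluster at 100% per node.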

 
Kirk

 
On Mon, Nov 12, 2012, at 03:52 PM, Ananth Gundabattula wrote:

Hi all,
>
> 
>On an unrelated note, from the readings below it looks like all 3 nodes own 
>100% of the data. This confuses me a bit. We have a 12 node cluster with RF=3 
>but the effective ownership is shown as 8.33%. 
>
> 
>So here is my question: how is the ownership calculated? Is the replication 
>factor considered in the ownership calculation? (If yes, then 8.33% ownership 
>of a cluster seems wrong to me. If not, 100% ownership for a 3 node cluster 
>seems wrong to me.) Am I missing something in the calculation? 
>
> 
>Regards,
>
>Ananth
>
> 
>On Fri, Nov 9, 2012 at 4:37 PM, Wei Zhu <wz1...@yahoo.com> wrote:
>
>Hi All,
>>
>>I am doing a benchmark on Cassandra. I have a three node cluster with RF=3. 
>>I generated 6M rows with sequence numbers from 1 to 6M, so the rows should be 
>>evenly distributed among the three nodes, disregarding the replicas. 
>>
>>The benchmark uses read-only requests: I generate read requests for randomly 
>>generated keys from 1 to 6M. Oddly, nodetool cfstats reports that one node 
>>gets only half as many requests as another, and the third node sits in the 
>>middle, so the ratio is about 2:3:4. The node with the most read requests 
>>actually has the smallest latency, and the one with the fewest read requests 
>>reports the largest latency. The difference is pretty big: the fastest node 
>>is almost twice as fast as the slowest.
>>
>>All three nodes have exactly the same hardware, and the data size on each 
>>node is the same, since the RF is three and all of them have the complete 
>>data. I am using Hector as the client, and the random read requests number in 
>>the millions. I can't think of a reasonable explanation. Can someone please 
>>shed some light?
>>
>> 
>>Thanks.
>>-Wei
>>
