I'm not sure if it's significant, but at first glance the IP addresses all have the same leading octets with the PropertyFileSnitch, yet with the EC2Snitch all the octets are different.
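For reference, the PropertyFileSnitch takes its DC/rack assignments from cassandra-topology.properties on each node. A minimal sketch of that file for your four PropertyFileSnitch nodes - the DC/rack values here are just lifted from your ring output, so treat it as illustrative only, not as what you actually have configured:

    # cassandra-topology.properties
    # one <node IP>=<data centre>:<rack> entry per node
    # (example only - the real values must match your intended topology)
    192.168.2.1=us-east:1b
    192.168.2.2=us-east:1b
    192.168.2.6=us-west:1c
    192.168.2.7=us-west:1c
    # any node not listed above falls back to this entry
    default=us-east:1b

It might be worth double-checking that every node has the same copy of that file.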
Ergo: going by the addresses, the PropertyFileSnitch nodes are all in the same data centre [168] and the same rack [2], while the EC2Snitch nodes are in 3 different data centres [20, 73, 236].

I'm still new at this too and may not have the full answer, as we are only now prepping our prod env with the PropertyFileSnitch, 2 DCs and 3 nodes per DC. Our QA environment is configured much the same way, though, only it's 3 nodes in a single DC:

    consistency: LOCAL_QUORUM
    strategy: NetworkTopologyStrategy
    strategy_options: datacenter1:3

Our distribution is an even 33% per node.

Just reading the docs on the DataStax website, I'm starting to wonder how the PropertyFileSnitch distributes the data across the DCs:

    "For NetworkTopologyStrategy, it specifies the number of replicas per data center in a comma separated list of datacenter_name:number_of_replicas."

I'm wondering if you need to increase your replication factor to 3 to see the data replicate across the DCs - there's a sketch of what I mean, plus a note on how LOCAL_QUORUM counts replicas, at the bottom of this message.

Anthony

On 17/09/2011, at 8:36 PM, Chris Marino wrote:

> Anthony, we used the Ec2Snitch for one set of runs, but for another set we're using PropertyFileSnitch.
>
> With the PropertyFileSnitch we see:
>
> Address          DC       Rack  Status  State   Load      Owns    Token
>                                                                   85070591730234615865843651857942052865
> 192.168.2.1      us-east  1b    Up      Normal  60.59 MB  50.00%  0
> 192.168.2.6      us-west  1c    Up      Normal  26.5 MB   0.00%   1
> 192.168.2.2      us-east  1b    Up      Normal  29.86 MB  50.00%  85070591730234615865843651857942052864
> 192.168.2.7      us-west  1c    Up      Normal  60.63 MB  0.00%   85070591730234615865843651857942052865
>
> While with the EC2Snitch we see:
>
> Address          DC       Rack  Status  State   Load      Owns    Token
>                                                                   85070591730234615865843651857942052865
> 107.20.68.176    us-east  1b    Up      Normal  59.95 MB  50.00%  0
> 204.236.179.193  us-west  1c    Up      Normal  53.67 MB  0.00%   1
> 184.73.133.171   us-east  1b    Up      Normal  60.65 MB  50.00%  85070591730234615865843651857942052864
> 204.236.166.4    us-west  1c    Up      Normal  26.33 MB  0.00%   85070591730234615865843651857942052865
>
> What's also strange is that the Load on the nodes changes as well. For example, node 204.236.166.4 is sometimes very low (~26KB), other times it's closer to 30MB. We see the same kind of variability in both clusters.
>
> For both clusters, we're running stress tests with the following options:
>
> --consistency-level=LOCAL_QUORUM --threads=4 --replication-strategy=NetworkTopologyStrategy --strategy-properties=us-east:2,us-west:2 --column-size=128 --keep-going --num-keys=100000 -r
>
> Any clues to what is going on here are greatly appreciated.
>
> Thanks
> CM
>
> On Sat, Sep 17, 2011 at 12:15 PM, Ikeda Anthony <anthony.ikeda....@gmail.com> wrote:
> What snitch do you have configured? We typically see a proper spread of data across all our nodes equally.
>
> Anthony
>
> On 17/09/2011, at 10:06 AM, Chris Marino wrote:
>
>> Hi, I have a question about what to expect when running a cluster across datacenters with Local Quorum consistency.
>>
>> My simplistic assumption is that the performance of an 8 node cluster split across 2 data centers and running with local quorum would perform roughly the same as a 4 node cluster in one data center.
>>
>> I'm 95% certain we've set up the keyspace so that the entire range is in one datacenter and the client is local. I see the keyspace split across all the local nodes, with the remote nodes owning 0%. Yet when I run the stress tests against this configuration with local quorum, I see dramatically different results from when I ran the same tests against a 4 node cluster.
>> I'm still 5% unsure of this because the documentation on how to configure this is pretty thin.
>>
>> My understanding of Local Quorum was that once the data was written to a local quorum, the commit would complete. I also believed that this would eliminate any WAN latency required for replication to the other DC.
>>
>> It's not just that the split cluster runs slower, it's also that there is enormous variability in identical tests - sometimes by a factor of 2 or more. It seems as though the WAN latency is not only impacting performance, but also introducing a wide variation in overall performance.
>>
>> Should WAN latency be completely hidden with local quorum? Or are there second order issues involved that will impact performance?
>>
>> I'm running in EC2 across the us-east/west regions. I already know how unpredictable EC2 performance can be, but what I'm seeing here is far beyond normal performance variability for EC2.
>>
>> Is there something obvious that I'm missing that would explain why the results are so different?
>>
>> Here's the config when we run a 2x2 cluster:
>>
>> Address          DC       Rack  Status  State   Load      Owns    Token
>>                                                                   85070591730234615865843651857942052865
>> 192.168.2.1      us-east  1b    Up      Normal  25.26 MB  50.00%  0
>> 192.168.2.6      us-west  1c    Up      Normal  12.68 MB  0.00%   1
>> 192.168.2.2      us-east  1b    Up      Normal  12.56 MB  50.00%  85070591730234615865843651857942052864
>> 192.168.2.7      us-west  1c    Up      Normal  25.48 MB  0.00%   85070591730234615865843651857942052865
>>
>> Thanks in advance.
>> CM
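PS: On the replication factor point above - since you're creating the keyspace through the stress tool, the change I had in mind is just bumping the per-DC counts in the options you're already passing. Assuming the rest of your command stays exactly the same, something like:

    --consistency-level=LOCAL_QUORUM --threads=4
    --replication-strategy=NetworkTopologyStrategy
    --strategy-properties=us-east:3,us-west:3
    --column-size=128 --keep-going --num-keys=100000 -r

With only two nodes in each DC you can't actually place 3 distinct replicas per DC, so this would really only apply once you have 3+ nodes per DC, like the prod layout we're prepping.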
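And on the LOCAL_QUORUM question: as I understand it from the docs, the quorum is calculated from the replication factor of the local data centre only, i.e. quorum = (local RF / 2) + 1 with the division rounded down. So with us-east:2,us-west:2 and the client in us-east, each write has to be acknowledged by (2 / 2) + 1 = 2 replicas, which is both us-east nodes; the coordinator still ships the write to the us-west replicas, it just doesn't wait for them before returning. So in theory the WAN hop shouldn't be on the commit path for your writes, which matches your expectation - I'm not sure yet what explains the variability you're seeing.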