Anthony, we used the Ec2Snitch for one set of runs, but for another set we're using the PropertyFileSnitch.
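In case it helps, the cassandra-topology.properties for the PropertyFileSnitch runs maps the nodes along these lines (abbreviated sketch; the IPs and DC/rack names are taken from the ring output below, and the default= fallback entry is just illustrative):

    # node -> datacenter:rack mapping used by the PropertyFileSnitch
    192.168.2.1=us-east:1b
    192.168.2.2=us-east:1b
    192.168.2.6=us-west:1c
    192.168.2.7=us-west:1c
    # fallback for any node not listed above
    default=us-east:1b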
With the PropertyFileSnitch we see:

Address          DC       Rack  Status  State   Load      Owns    Token
                                                                  85070591730234615865843651857942052865
192.168.2.1      us-east  1b    Up      Normal  60.59 MB  50.00%  0
192.168.2.6      us-west  1c    Up      Normal  26.5 MB   0.00%   1
192.168.2.2      us-east  1b    Up      Normal  29.86 MB  50.00%  85070591730234615865843651857942052864
192.168.2.7      us-west  1c    Up      Normal  60.63 MB  0.00%   85070591730234615865843651857942052865

While with the Ec2Snitch we see:

Address          DC       Rack  Status  State   Load      Owns    Token
                                                                  85070591730234615865843651857942052865
107.20.68.176    us-east  1b    Up      Normal  59.95 MB  50.00%  0
204.236.179.193  us-west  1c    Up      Normal  53.67 MB  0.00%   1
184.73.133.171   us-east  1b    Up      Normal  60.65 MB  50.00%  85070591730234615865843651857942052864
204.236.166.4    us-west  1c    Up      Normal  26.33 MB  0.00%   85070591730234615865843651857942052865

What's also strange is that the Load on the nodes changes as well. For example, node 204.236.166.4 is sometimes very low (~26 KB) and other times closer to 30 MB. We see the same kind of variability in both clusters.

For both clusters, we're running stress tests with the following options:

    --consistency-level=LOCAL_QUORUM
    --threads=4
    --replication-strategy=NetworkTopologyStrategy
    --strategy-properties=us-east:2,us-west:2
    --column-size=128
    --keep-going
    --num-keys=100000
    -r
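For what it's worth, those stress options should end up equivalent to a keyspace defined roughly like this (cassandra-cli sketch; Keyspace1 is what the stress tool creates by default, and the strategy_options syntax varies a bit between CLI versions):

    create keyspace Keyspace1
      with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
      and strategy_options = [{us-east:2, us-west:2}];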
Any clues to what is going on here are greatly appreciated.
Thanks
CM

On Sat, Sep 17, 2011 at 12:15 PM, Ikeda Anthony <anthony.ikeda....@gmail.com> wrote:

> What snitch do you have configured? We typically see a proper spread of
> data across all our nodes equally.
>
> Anthony
>
> On 17/09/2011, at 10:06 AM, Chris Marino wrote:
>
> Hi, I have a question about what to expect when running a cluster across
> datacenters with Local Quorum consistency.
>
> My simplistic assumption is that an 8 node cluster split across 2 data
> centers and running with local quorum would perform roughly the same as a
> 4 node cluster in one data center.
>
> I'm 95% certain we've set up the keyspace so that the entire range is in
> one datacenter and the client is local. I see the keyspace split across all
> the local nodes, with remote nodes owning 0%. Yet when I run the stress
> tests against this configuration with local quorum, I see dramatically
> different results from when I ran the same tests against a 4 node cluster.
> I'm still 5% unsure of this because the documentation on how to configure
> this is pretty thin.
>
> My understanding of Local Quorum was that once the data was written to a
> local quorum, the commit would complete. I also believed that this would
> eliminate any WAN latency required for replication to the other DC.
>
> It's not just that the split cluster runs slower, it's also that there is
> enormous variability in identical tests, sometimes by a factor of 2 or
> more. It seems as though the WAN latency is not only impacting performance,
> but that it's also introducing a wide variation in overall performance.
>
> Should WAN latency be completely hidden with local quorum? Or are there
> second-order issues involved that will impact performance?
>
> I'm running in EC2 across us-east/west regions. I already know how
> unpredictable EC2 performance can be, but what I'm seeing here is far
> beyond normal performance variability for EC2.
>
> Is there something obvious that I'm missing that would explain why the
> results are so different?
>
> Here's the config when we run a 2x2 cluster:
>
> Address          DC       Rack  Status  State   Load      Owns    Token
>                                                                   85070591730234615865843651857942052865
> 192.168.2.1      us-east  1b    Up      Normal  25.26 MB  50.00%  0
> 192.168.2.6      us-west  1c    Up      Normal  12.68 MB  0.00%   1
> 192.168.2.2      us-east  1b    Up      Normal  12.56 MB  50.00%  85070591730234615865843651857942052864
> 192.168.2.7      us-west  1c    Up      Normal  25.48 MB  0.00%   85070591730234615865843651857942052865
>
> Thanks in advance.
> CM
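P.S. We drive everything through the stress tool, but for clarity, the consistency level we're relying on is the same thing a client would set explicitly. A pycassa sketch of what the stress writes amount to (host, keyspace, and column family names here are just placeholders matching the stress defaults):

    import pycassa
    from pycassa.cassandra.ttypes import ConsistencyLevel

    # connect to a node in the local (us-east) data center
    pool = pycassa.ConnectionPool('Keyspace1', server_list=['192.168.2.1:9160'])

    # LOCAL_QUORUM: the write returns once a quorum of the us-east replicas
    # (2 of 2 with strategy_options us-east:2) have acknowledged it; the
    # us-west replicas are updated asynchronously
    cf = pycassa.ColumnFamily(pool, 'Standard1',
                              write_consistency_level=ConsistencyLevel.LOCAL_QUORUM)
    cf.insert('key1', {'col1': 'value1'})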