Forgot to mention: all 9 nodes are on Cassandra 1.2.9. Also, tpstats on the high-CPU node shows:
Pool Name               Active   Pending     Completed   Blocked   All time blocked
ReadStage                   32      6600    3420385815         0                  0
RequestResponseStage         0         0    2094235864         0                  0
MutationStage                0         0    3102461222         0                  0
ReadRepairStage              0         0        438089         0                  0
*ReplicateOnWriteStage       0         0     253180440         0           23703996*
GossipStage                  0         0       5917301         0                  0
AntiEntropyStage             0         0          1486         0                  0
MigrationStage               0         0           143         0                  0
MemtablePostFlusher          0         0         39070         0                  0
FlushWriter                  0         0          7452         0                927
MiscStage                    0         0           257         0                  0
commitlog_archiver           0         0             0         0                  0
AntiEntropySessions          0         0             1         0                  0
InternalResponseStage        0         0            62         0                  0
HintedHandoff                0         0          1961         0                  0

Message type        Dropped
RANGE_SLICE            1681
READ_REPAIR            3921
BINARY                    0
READ                4103953
MUTATION            2651071
_TRACE                    0
REQUEST_RESPONSE       3229

On Fri, Nov 1, 2013 at 3:37 PM, Rakesh Rajan <rakes...@gmail.com> wrote:

> @Tyler / @Rob,
>
> As Ashish mentioned earlier, we have 9 nodes on AWS - 6 in East Coast and
> 3 in Singapore. All 9 nodes use EC2Snitch. The current ring (across all
> nodes in both DCs) looks like this:
>
> ip11 - East Coast - m1.xlarge / us-east-1b      - Size: 83 GB - Token: 0
> ip21 - Singapore  - m1.xlarge / ap-southeast-1a - Size: 88 GB - Token: 1001
> ip12 - East Coast - m1.xlarge / us-east-1b      - Size: 45 GB - Token: 28356863910078205288614550619314017621
> ip13 - East Coast - m1.xlarge / us-east-1c      - Size: 93 GB - Token: 56713727820156410577229101238628035241
> ip22 - Singapore  - m1.xlarge / ap-southeast-1b - Size: 73 GB - Token: 56713727820156410577229101238628036241
> ip14 - East Coast - m1.xlarge / us-east-1c      - Size: 20 GB - Token: 85070591730234615865843651857942052863
> ip15 - East Coast - m1.xlarge / us-east-1d      - Size: 89 GB - Token: 113427455640312821154458202477256070484
> ip23 - Singapore  - m1.xlarge / ap-southeast-1b - Size: 56 GB - Token: 113427455640312821154458202477256071484
> ip16 - East Coast - m1.xlarge / us-east-1d      - Size: 25 GB - Token: 141784319550391026443072753096570088105
>
> Regarding the alternating-racks solution, I have the following questions:
>
> 1) By alternating racks, do you mean alternating racks among the nodes
> within a single DC, or across both DCs? AWS East Coast has 4 AZs and
> Singapore has 2 AZs. So is the final solution something like this:
>
> ip11 - East Coast - m1.xlarge / us-east-1b        - Token: 0
> ip21 - Singapore  - m1.xlarge / ap-southeast-1a   - Token: 1001
> ip12 - East Coast - m1.xlarge / us-east-*1c*      - Token: 28356863910078205288614550619314017621
> ip13 - East Coast - m1.xlarge / us-east-*1d*      - Token: 56713727820156410577229101238628035241
> ip22 - Singapore  - m1.xlarge / ap-southeast-1b   - Token: 56713727820156410577229101238628036241
> ip14 - East Coast - m1.xlarge / us-east-*1a*      - Token: 85070591730234615865843651857942052863
> ip15 - East Coast - m1.xlarge / us-east-*1b*      - Token: 113427455640312821154458202477256070484
> ip23 - Singapore  - m1.xlarge / ap-southeast-*1a* - Token: 113427455640312821154458202477256071484
> ip16 - East Coast - m1.xlarge / us-east-*1c*      - Token: 141784319550391026443072753096570088105
>
> Is this what you had suggested?
>
> 2) How does dynamic_snitch_badness_threshold: 0.1 affect the CPU load?
> On the node (ip11) that had high CPU (system load > 30), I checked the
> score attribute (via the JMX bean
> org.apache.cassandra.db:type=DynamicEndpointSnitch) and saw the following:
>
> East Coast:
> *ip11 = 1.6813321647677475*
> ip12 = 1.0003505696757231
> ip13 = 1.1324160525509974
> ip14 = 1.000350569675723
> ip15 = 1.0007011393514456
> ip16 = 1.0005258545135842
> Singapore:
> ip21 = 1.095880806310253
> ip22 = 1.4100000000000001
> ip23 = 1.0953549517966696
>
> So the ip11 node does indeed have the higher score - but I'm not sure why
> traffic is still going to that replica as opposed to some other node?
>
> Thanks!
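For anyone who wants to pull those snitch scores without a GUI console, here is a minimal Java JMX sketch - assuming JMX is listening on Cassandra's default port 7199 with no authentication. The MBean name is the one quoted above, and its "Scores" attribute is the endpoint-to-score map; the class name is just for illustration.

    import java.util.Map;
    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class SnitchScores {
        public static void main(String[] args) throws Exception {
            String host = args.length > 0 ? args[0] : "localhost";
            // Default Cassandra JMX endpoint; adjust port/credentials as needed.
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://" + host + ":7199/jmxrmi");
            JMXConnector jmxc = JMXConnectorFactory.connect(url);
            try {
                MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
                ObjectName snitch = new ObjectName(
                        "org.apache.cassandra.db:type=DynamicEndpointSnitch");
                // "Scores" maps each endpoint to the latency score the
                // dynamic snitch uses when ranking replicas.
                Map<?, ?> scores = (Map<?, ?>) mbs.getAttribute(snitch, "Scores");
                for (Map.Entry<?, ?> e : scores.entrySet()) {
                    System.out.println(e.getKey() + " = " + e.getValue());
                }
            } finally {
                jmxc.close();
            }
        }
    }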
> On Fri, Nov 1, 2013 at 3:13 PM, Ashish Tyagi <tyagi.i...@gmail.com> wrote:
>
>> Hi Evan,
>>
>> The clients connect to all nodes. We tried shutting down the Thrift
>> server on the affected node. Loads did not come down.
>>
>> On Fri, Nov 1, 2013 at 12:59 AM, Evan Weaver <e...@fauna.org> wrote:
>>
>>> Are all your clients only connecting to your first node? I would
>>> probably strace it and compare the trace to one from a lightly loaded
>>> node.
>>>
>>> On Thu, Oct 31, 2013 at 7:12 PM, Ashish Tyagi <tyagi.i...@gmail.com> wrote:
>>>
>>> > We have a 9-node cluster. 6 nodes are in one datacenter and 3 nodes
>>> > in the other. All machines are Amazon m1.xlarge instances.
>>> >
>>> > Datacenter: DC1
>>> > ==========
>>> > Address  Rack  Status  State   Load      Owns    Token
>>> > ip11     1b    Up      Normal  76.46 GB  16.67%  0
>>> > ip12     1b    Up      Normal  44.66 GB  16.67%  28356863910078205288614550619314017621
>>> > ip13     1c    Up      Normal  85.94 GB  16.67%  56713727820156410577229101238628035241
>>> > ip14     1c    Up      Normal  17.55 GB  16.67%  85070591730234615865843651857942052863
>>> > ip15     1d    Up      Normal  80.74 GB  16.67%  113427455640312821154458202477256070484
>>> > ip16     1d    Up      Normal  20.88 GB  16.67%  141784319550391026443072753096570088105
>>> >
>>> > Datacenter: DC2
>>> > ==========
>>> > Address  Rack  Status  State   Load      Owns    Token
>>> > ip21     1a    Up      Normal  78.32 GB  0.00%   1001
>>> > ip22     1b    Up      Normal  71.23 GB  0.00%   56713727820156410577229101238628036241
>>> > ip23     1b    Up      Normal  53.49 GB  0.00%   113427455640312821154458202477256071484
>>> >
>>> > The problem is that the node with IP address ip11 often has 5-10 times
>>> > more load than any other node. Most of the operations are on counters.
>>> > The primary column family (which receives most writes) has a
>>> > replication factor of 2 in datacenter DC1 and also in datacenter DC2.
>>> > The traffic is write-heavy (reads are less than 10% of total requests).
>>> > We are using size-tiered compaction. Both writes and reads happen with
>>> > a consistency level of LOCAL_QUORUM.
>>> >
>>> > More information:
>>> >
>>> > 1. cassandra.yaml - http://pastebin.com/u344fA6z
>>> > 2. Jmap heap when node under high load - http://pastebin.com/ib3D0Pa
>>> > 3. Nodetool tpstats - http://pastebin.com/s0AS7bGd
>>> > 4. Cassandra-env.sh - http://pastebin.com/ubp4cGUx
>>> > 5. GC log lines - http://pastebin.com/Y0TKphsm
>>> >
>>> > Am I doing anything wrong? Any pointers will be appreciated.
>>> >
>>> > Thanks in advance,
>>> > Ashish
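For reference, the DC1 tokens in the ring above follow the usual evenly spaced RandomPartitioner layout (token_i = i * 2^127 / N), with the DC2 tokens offset by roughly 1000 so no two nodes in different DCs share a token. A rough Java sketch of that calculation - the class and method names are just for illustration, and the rounding may differ by one from the tokens actually in use:

    import java.math.BigInteger;

    public class TokenLayout {
        // RandomPartitioner's token space is [0, 2^127).
        static final BigInteger RING = BigInteger.valueOf(2).pow(127);

        // Evenly spaced initial tokens for a DC with `nodes` nodes, shifted
        // by a small `offset` so nodes in different DCs never collide.
        static void printTokens(String dc, int nodes, long offset) {
            for (int i = 0; i < nodes; i++) {
                BigInteger token = RING.multiply(BigInteger.valueOf(i))
                        .divide(BigInteger.valueOf(nodes))
                        .add(BigInteger.valueOf(offset));
                System.out.println(dc + " node " + i + ": " + token);
            }
        }

        public static void main(String[] args) {
            printTokens("DC1", 6, 0);    // East Coast: 0, 28356863910..., ...
            printTokens("DC2", 3, 1001); // Singapore: offset by ~1000
        }
    }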