Hi Evan,

The clients connect to all nodes. We tried shutting down the Thrift server on the affected node, but the load did not come down.
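
To help rule out token placement as a cause, here is a rough sketch that compares the DC1 tokens from the ring output quoted below against evenly spaced ones. It assumes RandomPartitioner (token range 0 to 2**127), which the token values suggest but which should be confirmed against cassandra.yaml. The helper names `balanced_tokens` and `max_deviation` are made up for illustration.

```python
# Assumption: RandomPartitioner, whose token space is 0 .. 2**127.
RING_SIZE = 2 ** 127


def balanced_tokens(node_count, offset=0):
    """Return node_count evenly spaced tokens, each shifted by offset."""
    return [(i * RING_SIZE) // node_count + offset for i in range(node_count)]


def max_deviation(actual, expected):
    """Largest absolute gap between actual and expected tokens, pairwise after sorting."""
    return max(abs(a - e) for a, e in zip(sorted(actual), sorted(expected)))


# DC1 tokens copied from the ring output quoted below.
dc1_actual = [
    0,
    28356863910078205288614550619314017621,
    56713727820156410577229101238628035241,
    85070591730234615865843651857942052863,
    113427455640312821154458202477256070484,
    141784319550391026443072753096570088105,
]

print("max token deviation in DC1:", max_deviation(dc1_actual, balanced_tokens(6)))
```

A deviation that is tiny compared with 2**127 would suggest the ring itself is balanced, so the extra load on ip11 would have to come from something else, such as request routing or hot keys.
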

On Fri, Nov 1, 2013 at 12:59 AM, Evan Weaver <e...@fauna.org> wrote:
> Are all your clients only connecting to your first node? I would
> probably strace it and compare the trace to one from a lightly loaded
> node.
>
> On Thu, Oct 31, 2013 at 7:12 PM, Ashish Tyagi <tyagi.i...@gmail.com> wrote:
> > We have a 9 node cluster. 6 nodes are in one data-center and 3 nodes in the
> > other. All machines are Amazon M1.XLarge configuration.
> >
> > Datacenter: DC1
> > ==========
> > Address  Rack  Status  State   Load      Owns    Token
> > ip11     1b    Up      Normal  76.46 GB  16.67%  0
> > ip12     1b    Up      Normal  44.66 GB  16.67%  28356863910078205288614550619314017621
> > ip13     1c    Up      Normal  85.94 GB  16.67%  56713727820156410577229101238628035241
> > ip14     1c    Up      Normal  17.55 GB  16.67%  85070591730234615865843651857942052863
> > ip15     1d    Up      Normal  80.74 GB  16.67%  113427455640312821154458202477256070484
> > ip16     1d    Up      Normal  20.88 GB  16.67%  141784319550391026443072753096570088105
> >
> > Datacenter: DC2
> > ==========
> > Address  Rack  Status  State   Load      Owns    Token
> > ip21     1a    Up      Normal  78.32 GB  0.00%   1001
> > ip22     1b    Up      Normal  71.23 GB  0.00%   56713727820156410577229101238628036241
> > ip23     1b    Up      Normal  53.49 GB  0.00%   113427455640312821154458202477256071484
> >
> > Problem is that node with ip address: ip11 often has 5-10 times more load
> > than any other node. Most of the operations are on counters. The primary
> > column family (which receives most writes) has a replication factor of 2 in
> > DataCenter DC1 and also in DataCenter DC2. The traffic is write heavy (reads
> > are less than 10% of total requests). We are using size-tiered compaction.
> > Both writes and reads happen with a consistency factor of LOCAL_QUORUM.
> >
> > More information:
> >
> > 1. cassandra.yaml - http://pastebin.com/u344fA6z
> > 2. Jmap heap when node under high loads - http://pastebin.com/ib3D0Pa
> > 3. Nodetool tpstats - http://pastebin.com/s0AS7bGd
> > 4. Cassandra-env.sh - http://pastebin.com/ubp4cGUx
> > 5. GC log lines - http://pastebin.com/Y0TKphsm
> >
> > Am I doing anything wrong. Any pointers will be appreciated.
> >
> > Thanks in advance,
> > Ashish