Just to clarify, does "adding a node" include initiating a repair for the cluster? Or are you simply bootstrapping a new node and nothing else?
--
Sent from Mailbox

On Sat, Oct 25, 2014 at 2:38 PM, null <aiva...@iponweb.net> wrote:

> Dear all,
>
> So, here is our setup so far:
> - Ubuntu 12.04
> - Cassandra 2.0.10, JDK 1.7.0_65-b17
> - 6 nodes (EC2 c3.8xlarge / 32 cores / 60GB RAM, EBS disks for data,
>   ephemeral SSD for commit logs etc.)
> - pretty heavy write load - 100Ks/second
> - RF=2, one DC, 2 racks
> - everything works just fine with low CPU consumption - load average tends
>   to be around 4-10
>
> Now, we are trying to add a node. This causes a heavy load on existing
> nodes - like over 100 load average. The cluster becomes unresponsive, and
> writes and reads mostly fail.
>
> The weird observations are that:
> - without adding a new node, CPU is low
> - if we turn off writes while adding a new node, load average on existing
>   nodes drops back to 4-10 and the new node does just fine
>
> I've checked VisualVM sampling and basically all the CPU on existing nodes
> is consumed by org.jboss.netty.channel.socket.nio.SelectorUtil.select().
>
> What we tried so far:
> - throttling streaming - no impact
> - disabling internode compression - no impact
> - disabling autocompaction on existing nodes - no impact
> - even running with -Dorg.jboss.netty.epollBugWorkaround=true - no impact
>
> And as of now we are somewhat desperate, as this behavior is a blocker for
> us - we can't afford losing writes and we will need to expand C*
> dynamically.
>
> Has anyone encountered something similar? Any ideas/hints?
>
> Thanks