Also, once you've got your phi_convict_threshold sorted, if you see these 
errors again, check:

http://status.aws.amazon.com/ 

AWS does occasionally have periods of increased latency or outright outages. 
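
For a rough sense of why raising phi_convict_threshold (set in cassandra.yaml, default 8) helps on a jittery network, here is a minimal sketch of a phi accrual failure detector. It is a simplified model, not Cassandra's actual code: it assumes heartbeat gaps are exponentially distributed, and the 1.0 second mean gossip interval and both function names are illustrative assumptions.

```python
import math


def phi(silence_s, mean_interval_s):
    """Suspicion level after `silence_s` seconds without a heartbeat.

    Simplified model: with exponentially distributed heartbeat gaps,
    phi = -log10(P(gap > silence)) = silence / (mean * ln 10).
    """
    return silence_s / (mean_interval_s * math.log(10))


def silence_before_conviction(threshold, mean_interval_s):
    """Seconds of silence needed for phi to reach `threshold`."""
    return threshold * mean_interval_s * math.log(10)


if __name__ == "__main__":
    # mean_interval_s=1.0 is an assumed average gap between heartbeats.
    for threshold in (8, 12):
        secs = silence_before_conviction(threshold, mean_interval_s=1.0)
        print(f"phi_convict_threshold={threshold}: ~{secs:.1f}s of silence")
```

Under these assumptions, a threshold of 8 convicts a node after roughly 18.4 seconds of silence and a threshold of 12 after roughly 27.6 seconds, so the higher value rides out longer network hiccups at the cost of slower detection of nodes that really are down.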

Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359


On 19/05/2014, at 1:15 PM, Nate McCall <n...@thelastpickle.com> wrote:

> It's a good idea to increase phi_convict_threshold to at least 12 on EC2. 
> Using placement groups and single-tenant systems will certainly help.
> 
> Another optimization would be dedicating an Elastic Network Interface 
> (http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-eni.html) 
> specifically for gossip traffic. 
> 
> 
> On Mon, May 19, 2014 at 1:36 PM, Phil Burress <philburress...@gmail.com> 
> wrote:
> Has anyone experienced network I/O issues with EC2? We are seeing a lot of 
> these in our logs:
> 
> HintedHandOffManager.java (line 477) Timed out replaying hints to 
> /10.0.x.xxx; aborting (15 delivered)
> 
> and these...
> 
> Cannot handshake version with /10.0.x.xxx
> 
> and these...
> 
> java.io.IOException: Cannot proceed on repair because a neighbor 
> (/10.0.x.xxx) is dead: session failed
> 
> This occurs on all of our nodes, even though in every case the host that is 
> being reported as down or unavailable is up and readily 'pingable'.
> 
> We are using shared tenancy on all our nodes (instance type m1.xlarge) with 
> Cassandra 2.0.7. Any suggestions on how to debug these errors?
> 
> Is there a recommendation to move to Placement Groups for Cassandra?
> 
> Thanks!
> 
> Phil 
> 
> 
> 
> -- 
> -----------------
> Nate McCall
> Austin, TX
> @zznate
> 
> Co-Founder & Sr. Technical Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
