It's a good idea to increase phi_convict_threshold (in cassandra.yaml; the
default is 8) to at least 12 on EC2. Using placement groups and
single-tenant (dedicated) instances should also help.
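
To give a rough feel for what raising the threshold buys you, here is a
simplified sketch of a phi accrual calculation. It assumes heartbeat
inter-arrival times are exponentially distributed and that heartbeats are seen
about once per second; the function names are made up for illustration and this
is not Cassandra's actual failure detector code.

```python
import math


def phi(seconds_since_last_heartbeat, mean_interval_seconds):
    """Suspicion level under an assumed exponential inter-arrival model.

    P(next heartbeat arrives later than t) = exp(-t / mean)
    phi = -log10(P) = t / (mean * ln 10)
    """
    return seconds_since_last_heartbeat / (mean_interval_seconds * math.log(10))


def silence_tolerated(threshold, mean_interval_seconds):
    """Seconds without a heartbeat before phi reaches the threshold."""
    return threshold * mean_interval_seconds * math.log(10)


if __name__ == "__main__":
    mean = 1.0  # assumption: heartbeats observed roughly once per second
    for threshold in (8, 12):
        seconds = silence_tolerated(threshold, mean)
        print(f"threshold {threshold}: ~{seconds:.1f}s of silence tolerated")
```

Under these assumptions, going from 8 to 12 lets a node stay silent about 1.5
times as long before it is convicted, which is the slack you want on a network
with occasional latency spikes.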

Another optimization would be dedicating an Elastic Network Interface (
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-eni.html)
specifically for gossip traffic.


On Mon, May 19, 2014 at 1:36 PM, Phil Burress <philburress...@gmail.com> wrote:

> Has anyone experienced network i/o issues with ec2? We are seeing a lot of
> these in our logs:
>
> HintedHandOffManager.java (line 477) Timed out replaying hints to
> /10.0.x.xxx; aborting (15 delivered)
>
> and these...
>
> Cannot handshake version with /10.0.x.xxx
>
> and these...
>
> java.io.IOException: Cannot proceed on repair because a neighbor
> (/10.0.x.xxx) is dead: session failed
>
> Occurs on all of our nodes. Even though in all cases, the host that is
> being reported as down or unavailable is up and readily 'pingable'.
>
> We are using shared tenancy on all our nodes (instance type m1.xlarge)
> with cassandra 2.0.7. Any suggestions on how to debug these errors?
>
> Is there a recommendation to move to Placement Groups for Cassandra?
>
> Thanks!
>
> Phil
>



-- 
-----------------
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
