On Mon, Oct 20, 2014 at 5:45 AM, Alain RODRIGUEZ <[email protected]> wrote:
> Using Cassandra 1.2.18, we are experiencing an issue in our 2 DC
> (EC2MultiRegionSnitch) C*1.2.18 cluster.
>
> We have 2 DC and I saw some weird* inconsistencies between our 2 DC. I
> tried to run repair on all the nodes of both DCs (we tried running various
> repairs at the same time and also in a rolling repair way, also tried with
> and without the -pr option). It takes days (last run started 3 days ago on
> various machines). It seems to hang, since I can't see any validation
> compaction or any streams running. Though, I don't see any error either...
> The CF I am trying to repair right now is 350 MB large (per node), I am quite
> sure it shouldn't take that long... Repairing other CFs also gets stuck.
>

This is a long-standing issue which is unfortunately becoming a FAQ. Yes,
repair is broken in all versions of Cassandra up to at least 2.0.10.
Hopefully the latest streaming rewrite will finally fix it.

If you are really overprovisioned and on real hardware and network and SSD,
it might work sometimes.

Here's some related JIRA...

https://issues.apache.org/jira/browse/CASSANDRA-3486 - nodetool command to stop repair
https://issues.apache.org/jira/browse/CASSANDRA-7904 - good entry point into web of 2.0 era repair bugs

and last but not least the inappropriately hostile but accurate...

https://issues.apache.org/jira/browse/CASSANDRA-5396

=Rob
