Re: Repair hangs, seems to be stuck somehow

Robert Coli Mon, 20 Oct 2014 10:33:13 -0700

On Mon, Oct 20, 2014 at 5:45 AM, Alain RODRIGUEZ <[email protected]> wrote:


> Using Cassandra 1.2.18, we are experiencing an issue in our 2 DC
> (EC2MultiRegionSnitch) C* 1.2.18 cluster.
>
> We have 2 DCs and I saw some weird* inconsistencies between them. I
> tried to run repair on all the nodes of both DCs (we tried running several
> repairs at the same time and also a rolling repair, with and without the
> -pr option). It has been running for days (the last run started 3 days ago
> on various machines). It seems to hang, since I can't see any validation
> compaction or any streams running, though I don't see any error either...
> The CF I am trying to repair right now is 350 MB (per node); I am quite
> sure it shouldn't take that long... Repairs of other CFs also get stuck.
>

This is a long-standing issue which is unfortunately becoming a FAQ. Yes,
repair is broken in all versions of Cassandra up to at least 2.0.10;
hopefully the latest streaming rewrite will finally fix it. If you are
really overprovisioned and on real hardware, a real network, and SSDs, it
might work sometimes.
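
To tell a hung repair from a merely slow one, it can help to capture the
two views mentioned in the question (validation compactions and streams)
from every node at once. Below is a minimal Python sketch, assuming nodetool
is on the PATH and JMX is reachable on its default port. The helper names
are made up for illustration, and the script does not parse the output; it
only collects it so the nodes can be compared side by side.

```python
import subprocess


def nodetool_cmd(host, *args):
    """Build the argv for one nodetool call against a given host."""
    return ["nodetool", "-h", host, *args]


def repair_activity(host, timeout=60):
    """Return raw compactionstats and netstats output for one node.

    Validation compactions show up in compactionstats and repair
    streams in netstats; the text is returned unparsed because the
    exact layout differs between Cassandra versions.
    """
    out = {}
    for sub in ("compactionstats", "netstats"):
        proc = subprocess.run(
            nodetool_cmd(host, sub),
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        # Keep stderr when the call fails so the problem stays visible.
        out[sub] = proc.stdout if proc.returncode == 0 else proc.stderr
    return out


def report(hosts):
    """Print both views for each host, one labelled section per command."""
    for host in hosts:
        for sub, text in repair_activity(host).items():
            print(f"== {host}: nodetool {sub} ==")
            print(text.strip())
```

Calling report() with the list of node addresses every few minutes while a
repair is supposedly running shows whether anything is moving. If no
validation or streaming appears on any node for a long stretch, the session
is most likely hung rather than slow.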

Here are some related JIRA tickets:

https://issues.apache.org/jira/browse/CASSANDRA-3486 - nodetool command to
stop repair
https://issues.apache.org/jira/browse/CASSANDRA-7904 - good entry point
into web of 2.0 era repair bugs

and last but not least the inappropriately hostile but accurate...
https://issues.apache.org/jira/browse/CASSANDRA-5396

=Rob
