Hi,

I'm trying to run a repair on a node in my Cassandra cluster (version 3.7), and
was hoping someone could shed light on an error message that keeps
cropping up.

I started the repair on a node after discovering that it had somehow become
partitioned from the rest of the cluster, i.e. nodetool status on every other
node showed it as DN, and on the node itself all other nodes showed as DN.
After restarting the Cassandra daemon the node seemed to re-join the
cluster just fine, so I began a repair.

The repair has been running for about 33 hours (first incremental repair on
this cluster), and every so often I'll see a line like this:

[2017-08-31 00:18:16,300] Repair session
f7ae4e71-8ce3-11e7-b466-79eba0383e4f for range
[(-5606588017314999649,-5604469721630340065],
(9047587767449433379,9047652965163017217]] failed with error Endpoint /
20.0.122.204 died (progress: 9%)

Every one of these lines refers to the same node, 20.0.122.204.

I'm mostly looking for guidance here. Do these errors mean the
entire repair will be worthless, or only the repair of the token ranges shared
by these two nodes? Is it normal to see error messages of this nature and for
a repair not to terminate?

Thanks,
Paul
