Hi Micha,

Are you running incremental repair?
If so, then validation fails when 2 repair sessions are running at the same
time, with one anticompacting an SSTable and the other trying to run a
validation compaction on it.

If you check the logs of the node referred to in the "Validation
failed in ..." message, you should see error messages stating that
an SSTable can't be part of two different repair sessions.

If that happens (and you're indeed running incremental repair), you should
do a rolling restart of the cluster to stop all repairs and then repair one
node at a time.
Reaper does that, but you can handle it manually if you prefer. The plan
here is to wait for all anticompactions to be over before starting a repair
on the next node.
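
If you go the manual route, the "wait for anticompactions" step could be
scripted roughly like this. It is only a sketch: it assumes nodetool is on
the PATH and can reach each node with -h, and that a running anticompaction
shows up in the `nodetool compactionstats` output with "anticompaction"
somewhere in its line (check this against your version). The function names
and the 60 second poll interval are made up for the example.

```python
import subprocess
import time


def anticompaction_running(compactionstats_output: str) -> bool:
    """True if any line of the compactionstats output mentions anticompaction.

    Assumption: the compaction type column contains the word
    "anticompaction" (matched case-insensitively).
    """
    return any("anticompaction" in line.lower()
               for line in compactionstats_output.splitlines())


def compactionstats(host: str) -> str:
    """Run `nodetool -h <host> compactionstats` and return its stdout."""
    result = subprocess.run(
        ["nodetool", "-h", host, "compactionstats"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def wait_for_anticompactions(hosts, fetch=compactionstats,
                             poll_seconds=60, sleep=time.sleep):
    """Block until no host in `hosts` reports a running anticompaction."""
    while True:
        busy = [h for h in hosts if anticompaction_running(fetch(h))]
        if not busy:
            return
        sleep(poll_seconds)
```

The idea would be to run the repair on one node, call
wait_for_anticompactions() with the list of all nodes, and only then move on
to the next node.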

In any case, check the logs of the node that failed to run validation
compaction in order to understand what failed.
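
To narrow down the log on that node, a small filter like the one below may
help. It is a rough sketch that assumes the default log layout, where each
entry starts with its level (ERROR, WARN, ...); the keyword list is my own
guess at what is relevant here and is not taken from actual Cassandra
messages.

```python
from typing import Iterable, List

# Hypothetical keyword list; adjust to whatever you are hunting for.
KEYWORDS = ("repair", "validation", "anticompaction")
LEVELS = ("ERROR", "WARN")


def repair_related_errors(lines: Iterable[str]) -> List[str]:
    """Return WARN/ERROR lines that mention one of KEYWORDS."""
    hits = []
    for line in lines:
        # str.startswith accepts a tuple of prefixes.
        if line.startswith(LEVELS) and any(k in line.lower() for k in KEYWORDS):
            hits.append(line.rstrip("\n"))
    return hits
```

You would feed it the open system.log of the node (its location depends on
your install). Stack trace lines do not start with a level, so they are
skipped; use the matching lines as pointers into the full log.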

Cheers,

On Thu, Sep 14, 2017 at 10:18 AM Micha <mich...@fantasymail.de> wrote:

> Hi,
>
> I started a repair (7 nodes, C* 3.11) but at once I get an exception in
> the log:
> "RepairException: [#.... on keyspace/table, [....],
>  Validation failed in /ip"
>
> The started nodetool repair hangs (the whole day...), strace shows it's
> waiting...
>
> What's the reason for this exception and what to do now? If this is an
> error, why doesn't nodetool abort the command and show the error?
>
> thanks,
>  Michael
>
--
-----------------
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
