Kurt is right.

So here are the options I can think of:
- Use the join_ring=false technique and rely on hints. You'll also need to
disable the native transport on the node to prevent clients from
connecting to it directly. Hopefully, you can run the repair in less than 3
hours, which is the default hint window (hints will be collected while the
node hasn't joined the ring). Otherwise you'll have more consistency issues
after the node joins the ring again. Incremental repair might help fix
this quickly afterwards if you've been running full repairs that
involved anticompaction (if you're running at least Cassandra 2.2).
- Fully re-bootstrap the node by having it replace itself, using the
replace_address_first_boot technique (but since you have RF=2, that would
most probably mean some data loss, since you read/write at ONE).
- Try to trick the dynamic snitch into taking the node out of reads. You
would then have the node join the ring normally, disable native transport and
raise Severity (in org.apache.cassandra.db:type=DynamicEndpointSnitch) to
something like 50 so the node won't be selected by the dynamic snitch. I
guess the value will reset itself over time, so you may need to set it back
to 50 on a regular basis while the repair is running.
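
To make the first option more concrete, here is a rough sketch in Python of
the order of operations and of the hint-window check. It assumes the default
3-hour hint window (max_hint_window_in_ms in cassandra.yaml). The step list
and the helper name fits_in_hint_window are purely illustrative, and the
final enablebinary step is an assumption that is not stated above.

```python
from datetime import timedelta

# Assumption: the node uses the default hint window of 3 hours
# (max_hint_window_in_ms = 10800000). Check cassandra.yaml on your cluster.
HINT_WINDOW = timedelta(hours=3)

# Illustrative order of operations for the join_ring=false option.
# These are descriptions for an operator to follow, not an automated script.
JOIN_RING_FALSE_STEPS = [
    "start Cassandra with -Dcassandra.join_ring=false",
    "nodetool disablebinary   # keep clients from connecting directly",
    "nodetool repair          # run while the node is out of the ring",
    "nodetool join            # rejoin the ring once the repair is done",
    "nodetool enablebinary    # assumption: re-open client connections",
]


def fits_in_hint_window(repair_duration: timedelta,
                        hint_window: timedelta = HINT_WINDOW) -> bool:
    """Return True if the repair is expected to finish while the other
    nodes are still storing hints for the node being repaired."""
    return repair_duration < hint_window


if __name__ == "__main__":
    for step in JOIN_RING_FALSE_STEPS:
        print(step)
    print(fits_in_hint_window(timedelta(hours=2, minutes=30)))
```

If the estimated repair time does not fit in the window, writes received
after the window closes will not be hinted, and a further repair will be
needed once the node has rejoined.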

I would then strongly consider moving to RF=3 because RF=2 will lead you to
this type of situation again in the future and does not allow quorum reads
with fault tolerance.
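
The arithmetic behind that recommendation can be sketched as follows. This
is only an illustration of the usual quorum formula (floor(RF / 2) + 1); the
function names are made up for the example.

```python
def quorum(rf: int) -> int:
    """Number of replicas that must respond at QUORUM for a given RF."""
    return rf // 2 + 1


def tolerated_replica_failures(rf: int) -> int:
    """Replicas that can be down while QUORUM requests still succeed."""
    return rf - quorum(rf)


if __name__ == "__main__":
    for rf in (2, 3):
        print(f"RF={rf}: quorum={quorum(rf)}, "
              f"tolerates {tolerated_replica_failures(rf)} replica(s) down")
```

With RF=2 a quorum is both replicas, so a single node being down makes
QUORUM requests fail. With RF=3 a quorum is still 2, which leaves room for
one replica to be down.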

Good luck,

On Wed, Aug 29, 2018 at 1:56 PM Vlad <qa23d-...@yahoo.com.invalid> wrote:

> I restarted with cassandra.join_ring=false
> nodetool status on other nodes shows this node as DN, while it sees itself
> as UN.
>
>
> >I'd say best to just query at QUORUM until you can finish repairs.
> We have RF=2, so I guess QUORUM queries will fail. Also, different
> applications would need to be changed for this.
>
>
> On Wednesday, August 29, 2018 2:41 PM, kurt greaves <k...@instaclustr.com>
> wrote:
>
>
> Note that you'll miss incoming writes if you do that, so you'll be
> inconsistent even after the repair. I'd say best to just query at QUORUM
> until you can finish repairs.
>
> On 29 August 2018 at 21:22, Alexander Dejanovski <a...@thelastpickle.com>
> wrote:
>
> Hi Vlad, you must restart the node but first disable joining the cluster,
> as described in the second part of this blog post :
> http://thelastpickle.com/blog/2018/08/02/Re-Bootstrapping-Without-Bootstrapping.html
>
> Once repaired, you'll have to run "nodetool join" to start serving reads.
>
>
> On Wed, Aug 29, 2018 at 12:40, Vlad <qa23d-...@yahoo.com.invalid> wrote:
>
> Will it help to set read_repair_chance to 1 (compaction is
> SizeTieredCompactionStrategy)?
>
>
> On Wednesday, August 29, 2018 1:34 PM, Vlad <qa23d-...@yahoo.com.INVALID>
> wrote:
>
>
> Hi,
>
> quite urgent questions:
> due to disk and C* startup problems we were forced to delete the commit
> logs from one of the nodes.
>
> Now repair is running, but meanwhile some reads return no data (RF=2).
>
> Can this node be excluded from read queries, so that all reads are
> redirected to the other node in the ring?
>
>
> Thanks to All for help.
>
>
> --
> -----------------
> Alexander Dejanovski
> France
> @alexanderdeja
>
> Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
>
>
>
--
-----------------
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
