Re: Repair in Multi Datacenter - Should you use -dc Datacenter repair or repair with -pr

kurt Greaves Thu, 13 Oct 2016 21:37:19 -0700

Don't run -pr repairs when using incremental repair; you'll just end up with
loads of anti-compactions.
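
As a rough illustration of that rule, here is a minimal sketch that builds the
argument list for a nodetool repair call and refuses to combine -pr with
incremental repair. It assumes Cassandra 2.2 or later, where incremental repair
is the default and a full repair has to be requested with -full. The helper
name build_repair_command and the keyspace name my_ks are hypothetical.

```python
from typing import List, Optional


def build_repair_command(keyspace: str,
                         full: bool = True,
                         primary_range: bool = False,
                         datacenter: Optional[str] = None) -> List[str]:
    """Build an argv list for `nodetool repair` (hypothetical helper).

    Assumes Cassandra 2.2+, where omitting -full means incremental repair.
    """
    if primary_range and not full:
        # Advice from this thread: -pr plus incremental repair just
        # produces a lot of anti-compaction.
        raise ValueError("use full=True when primary_range=True")

    argv = ["nodetool", "repair"]
    if full:
        argv.append("-full")        # full (non-incremental) repair
    if primary_range:
        argv.append("-pr")          # only this node's primary ranges
    if datacenter is not None:
        argv += ["-dc", datacenter]  # restrict repair to the named DC
    argv.append(keyspace)
    return argv


if __name__ == "__main__":
    # "my_ks" is a placeholder keyspace name.
    print(" ".join(build_repair_command("my_ks", full=True, primary_range=True)))
```

The resulting list can be passed to subprocess.run on each node in turn; a -pr
repair only covers the whole ring if it is run on every node in every DC.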


On 12 October 2016 at 19:11, Harikrishnan Pillai <hpil...@walmartlabs.com>
wrote:

> In my experience, DC-local repair run node by node with the
> -pr and -par options is best. Full repair increased the number of
> SSTables a lot, and it took days to compact them back. Another
> easy option for repair is to use a Spark job to read all data at
> consistency level ALL and increase the read repair chance to
> 100%, or to use the Netflix Tickler.
>
> Sent from my iPhone
>
> On Oct 12, 2016, at 11:44 AM, Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:
>
> Hi Leena,
>
> The first thing you should be concerned about is: why doesn't the
> repair -pr operation complete?
> The second question is: which repair option is best?
>
>
> One probable cause of stuck repairs: if the firewall between DCs is
> closing TCP connections and Cassandra tries to use those connections,
> repairs will hang. Please refer to
> https://docs.datastax.com/en/cassandra/2.0/cassandra/troubleshooting/trblshootIdleFirewall.html
> We faced that.
>
> Also make sure you meet the basic bandwidth requirement between DCs.
> The recommendation is 1000 Mb/s (1 gigabit) or greater.
>
> Answers for specific questions:
> 1. As I understand it, not all replicas participate in DC-local
> repairs, so such a repair would be ineffective. You need to make sure that
> all replicas of the data in all DCs are in sync.
>
> 2. Each DC is not a ring of its own. All DCs together form one token ring.
> So I think yes, you should run repair -pr on all nodes.
>
> 3. Yes. I don't have experience with incremental repairs, but you can run
> repair -pr on all nodes of all DCs.
>
> Regarding the best approach to repair, you should look at the repair
> presentations from Cassandra Summit 2016. All are online now.
>
> I attended the summit, and people running large clusters generally use
> subrange repairs. But such large deployments are on
> older Cassandra versions and generally don't use vnodes,
> so people can easily tell which nodes hold which token range.
>
>
>
> Thanks
> Anuj
>
> ------------------------------
> *From: *Leena Ghatpande <lghatpa...@hotmail.com>;
> *To: *user@cassandra.apache.org <user@cassandra.apache.org>;
> *Subject: *Repair in Multi Datacenter - Should you use -dc Datacenter
> repair or repair with -pr
> *Sent: *Wed, Oct 12, 2016 2:15:51 PM
>
> Please advise. I cannot find any clear documentation on the best
> strategy for repairing nodes on a regular basis with multiple datacenters
> involved.
>
>
> We are running Cassandra 3.7 across multiple datacenters with 4 nodes in each
> datacenter. We are trying to run repairs every other night to keep the nodes in
> a good state. We currently run repair with the -pr option, but the repair process
> hangs and does not complete gracefully. We don't see any errors in the logs
> either.
>
>
> What is the best way to perform repairs across multiple datacenters on large
> tables?
>
> 1. Can we run a datacenter repair using the -dc option for each datacenter? Do
> we need to run repair on each node in that case, or will it repair all nodes
> within the datacenter?
>
> 2. Is running repair with -pr across all nodes required if we perform
> step 1 every night?
>
> 3. Is cross-datacenter repair required, and if so, what's the best option?
>
>
> Thanks
>
>
> Leena
>
>
>
>
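
For the subrange repairs mentioned in the quoted thread, a minimal sketch of
the idea: split one token range into contiguous pieces and repair each piece
with -st/-et. It assumes the Murmur3Partitioner (tokens from -2**63 to
2**63 - 1) and a non-wrapping range with start < end. The helper names and the
keyspace name my_ks are hypothetical; on a real cluster the ranges would come
from the ring layout rather than from the full token space.

```python
from typing import List, Tuple

# Token bounds of Murmur3Partitioner.
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def split_token_range(start: int, end: int, parts: int) -> List[Tuple[int, int]]:
    """Split the non-wrapping range (start, end] into `parts` contiguous subranges."""
    width = end - start
    if parts < 1 or width < parts:
        raise ValueError("need start < end and 1 <= parts <= end - start")
    bounds = [start + (width * i) // parts for i in range(parts)] + [end]
    return list(zip(bounds[:-1], bounds[1:]))


def subrange_repair_commands(keyspace: str, start: int, end: int,
                             parts: int) -> List[List[str]]:
    """One full-repair argv per subrange, using nodetool's -st / -et options."""
    return [
        ["nodetool", "repair", "-full", "-st", str(st), "-et", str(et), keyspace]
        for st, et in split_token_range(start, end, parts)
    ]


if __name__ == "__main__":
    # "my_ks" is a placeholder keyspace name.
    for argv in subrange_repair_commands("my_ks", MIN_TOKEN, MAX_TOKEN, 4):
        print(" ".join(argv))
```

Smaller subranges keep each repair session short, which makes a hung or
failed session cheaper to retry.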


-- 
Kurt Greaves
k...@instaclustr.com
www.instaclustr.com
