Re: Repair in Multi Datacenter - Should you use -dc Datacenter repair or repair with -pr

Harikrishnan Pillai Wed, 12 Oct 2016 12:12:02 -0700

In my experience, DC-local repair run node by node with the
-pr and -par options works best. Full repair increased the number
of SSTables a lot, and it took days to compact them back. Another
easy option for repair is to use a Spark job that reads all data with
consistency level ALL and to increase the read repair chance to
100%, or to use the Netflix Tickler.



On Oct 12, 2016, at 11:44 AM, Anuj Wadehra wrote:

Hi Leena,

The first thing you should be concerned about is: why doesn't the repair -pr
operation complete?
The second question is: which repair option is best?


One probable cause of stuck repairs: if the firewall between DCs closes idle
TCP connections and Cassandra then tries to use those connections, repairs will
hang. Please refer to
https://docs.datastax.com/en/cassandra/2.0/cassandra/troubleshooting/trblshootIdleFirewall.html
. We faced this issue ourselves.
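
To illustrate the idle-firewall problem: an idle connection survives only if the kernel sends a TCP keepalive probe before the firewall's idle timeout expires. The Python sketch below compares the two values. The function names and the 3600-second firewall timeout are assumptions for illustration only; 7200 seconds is the usual Linux default for `net.ipv4.tcp_keepalive_time`.

```python
from pathlib import Path

# Linux exposes the keepalive idle time (in seconds) here.
KEEPALIVE_TIME_PATH = Path("/proc/sys/net/ipv4/tcp_keepalive_time")


def keepalive_beats_firewall(keepalive_time_s: int, firewall_idle_timeout_s: int) -> bool:
    """Return True if the first keepalive probe is sent before the
    firewall would drop an idle connection."""
    return keepalive_time_s < firewall_idle_timeout_s


def read_keepalive_time(path: Path = KEEPALIVE_TIME_PATH) -> int:
    """Read the current keepalive idle time in seconds (Linux only)."""
    return int(path.read_text().strip())


# Assumed firewall idle timeout; the real value comes from your network team.
FIREWALL_IDLE_TIMEOUT_S = 3600

# With the Linux default of 7200 s the firewall wins and the connection dies.
default_is_safe = keepalive_beats_firewall(7200, FIREWALL_IDLE_TIMEOUT_S)
# With a lowered keepalive time of 60 s the connection is kept alive.
lowered_is_safe = keepalive_beats_firewall(60, FIREWALL_IDLE_TIMEOUT_S)
```

On a Linux node, `keepalive_beats_firewall(read_keepalive_time(), FIREWALL_IDLE_TIMEOUT_S)` would check the live setting.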

Also make sure you meet the basic bandwidth requirement between DCs.
The recommended bandwidth is 1000 Mb/s (1 gigabit) or greater.

Answers to your specific questions:
1. As per my understanding, not all replicas participate in DC-local
repairs, so the repair would be ineffective. You need to make sure that all
replicas of the data in all DCs are in sync.

2. Each DC is not its own ring. All DCs together form a single token ring. So,
I think yes, you should run repair -pr on all nodes.

3. Yes. I don't have experience with incremental repairs, but you can run repair
-pr on all nodes of all DCs.
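
The point in answer 2 can be sketched with a toy model. The Python snippet below assumes a made-up ring with a token space of 0 to 99, one token per node (no vnodes), and hypothetical node and DC names. It only illustrates why `repair -pr` has to run on every node in every DC: each node's primary range is the slice ending at its own token, and the nodes of a single DC cover only part of the ring.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical 4-node, 2-DC cluster: token -> (node name, datacenter).
RING: Dict[int, Tuple[str, str]] = {
    0: ("dc1-a", "DC1"),
    25: ("dc2-a", "DC2"),
    50: ("dc1-b", "DC1"),
    75: ("dc2-b", "DC2"),
}


def primary_ranges(ring: Dict[int, Tuple[str, str]]) -> Dict[str, Tuple[int, int]]:
    """Map each node to its primary range (previous_token, own_token]."""
    tokens = sorted(ring)
    ranges = {}
    for i, token in enumerate(tokens):
        prev = tokens[i - 1]  # for i == 0 this wraps around to the last token
        node, _dc = ring[token]
        ranges[node] = (prev, token)
    return ranges


def ranges_covered(ring: Dict[int, Tuple[str, str]],
                   dcs: Iterable[str]) -> List[Tuple[int, int]]:
    """Primary ranges repaired when -pr runs on every node in the given DCs."""
    wanted = set(dcs)
    pr = primary_ranges(ring)
    return sorted(pr[node] for node, dc in ring.values() if dc in wanted)


only_dc1 = ranges_covered(RING, ["DC1"])
all_dcs = ranges_covered(RING, ["DC1", "DC2"])
```

In this toy ring, running `-pr` only on the DC1 nodes leaves the ranges (0, 25] and (50, 75] unrepaired; covering both DCs repairs all four ranges.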

Regarding the best approach to repair, you should look at the repair
presentations from Cassandra Summit 2016. All are online now.

I attended the summit, and people running large clusters generally use sub-range
repairs to repair their clusters. But such large deployments are on older
Cassandra versions and generally don't use vnodes, so it is easy for them to
know which nodes hold which token range.
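
As a rough sketch of what sub-range repair involves, the Python snippet below splits a token range into contiguous pieces and builds one `nodetool repair -st <start> -et <end>` command per piece. The helper names and the keyspace `my_ks` are hypothetical, the token bounds assume Murmur3Partitioner, and a real tool would split each node's owned ranges rather than an arbitrary range.

```python
from typing import List, Tuple

MIN_TOKEN = -(2 ** 63)    # Murmur3Partitioner minimum token
MAX_TOKEN = 2 ** 63 - 1   # Murmur3Partitioner maximum token


def split_range(start: int, end: int, parts: int) -> List[Tuple[int, int]]:
    """Split (start, end] into `parts` contiguous, near-equal sub-ranges."""
    width = end - start
    if parts < 1 or width < parts:
        raise ValueError("need 1 <= parts <= end - start")
    # Integer arithmetic keeps the boundaries exact for 64-bit tokens.
    bounds = [start + (width * i) // parts for i in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def repair_commands(start: int, end: int, parts: int, keyspace: str) -> List[str]:
    """Build one sub-range repair command per piece (commands are not run)."""
    return [
        f"nodetool repair -st {st} -et {et} {keyspace}"
        for st, et in split_range(start, end, parts)
    ]


# Example: split a small, made-up range into two sub-range repairs.
commands = repair_commands(0, 100, 2, "my_ks")
```

Running the sub-ranges one at a time keeps each repair session small, which is the main appeal of this approach on large clusters.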



Thanks
Anuj


________________________________
From: Leena Ghatpande <[email protected]>;
To: [email protected];
Subject: Repair in Multi Datacenter - Should you use -dc Datacenter repair or 
repair with -pr
Sent: Wed, Oct 12, 2016 2:15:51 PM


Please advise. I cannot find any clear documentation on the best strategy
for repairing nodes on a regular basis when multiple datacenters are involved.


We are running Cassandra 3.7 in a multi-datacenter setup with 4 nodes in each
datacenter. We are trying to run repairs every other night to keep the nodes in
a good state. We currently run repair with the -pr option, but the repair
process hangs and does not complete gracefully. We don't see any errors in the
logs either.


What is the best way to perform repairs across multiple datacenters on large
tables?

1. Can we run a datacenter repair using the -dc option for each datacenter? Do
we need to run repair on each node in that case, or will it repair all nodes
within the datacenter?

2. Is running repair with -pr across all nodes required if we perform
step 1 every night?

3. Is cross-datacenter repair required, and if so, what's the best option?


Thanks


Leena
