Anyone have any more thoughts on this at all? Struggling to understand it.
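
In case extra context helps, this is roughly how the repairs discussed in the thread below get kicked off - a minimal sketch of the weekly cron wrapper, where the script name, log path and argument handling are simplified for illustration; only the two nodetool invocations themselves are the actual commands being compared:

#!/usr/bin/env bash
# repair-node.sh - illustrative sketch only; the script name and log path are made up.
# Cron runs this once a week on each node (each node has its own slot in the week).
set -euo pipefail

MODE=${1:-incremental}                              # "incremental" or "pr"
LOG=/var/log/cassandra/repair-$(date +%F).log       # hypothetical log location

case "$MODE" in
  incremental)
    # 2.2+ default (incremental): covers every range this node holds, but only
    # data not already marked as repaired.
    nodetool repair --in-local-dc >> "$LOG" 2>&1
    ;;
  pr)
    # Partitioner-range: repairs only the ranges this node owns as primary, so
    # each range is repaired once per pass over the whole cluster.
    nodetool repair --in-local-dc --partitioner-range >> "$LOG" 2>&1
    ;;
  *)
    echo "usage: $0 [incremental|pr]" >&2
    exit 1
    ;;
esac

Each node gets its own weekly slot, which is why the 24h+ incremental runs end up overlapping with the next node's run, as mentioned further down the thread.
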
> On 9 Jun 2017, at 11:32, Chris Stokesmore <chris.elsm...@demandlogic.co> wrote:
>
> Hi Anuj,
>
> Thanks for the reply.
>
> 1) We are using Cassandra 2.2.8, and the repair commands we are comparing are
> "nodetool repair --in-local-dc --partitioner-range" and
> "nodetool repair --in-local-dc"
> Since 2.2 I believe incremental repairs are the default - that seems to be confirmed in the logs that list the repair details when a repair starts.
>
> 2) From looking at a few runs, on average:
> With -pr repairs, each node takes approx. 6.5 - 8 hours, so a total over the 7 nodes of 53 hours.
> With just incremental repairs, each node takes ~26 - 29 hours, so a total of 193 hours.
>
> 3) We currently have two DCs in total: the 'production' ring with 7 nodes and RF=3, and a testing ring with a single node and RF=1, for the single keyspace we currently use.
>
> 4) Yes, that number came from the Cassandra repair logs from an incremental repair. I can share the numbers reported when using a -pr repair later this evening, once the currently running repair has completed.
>
> Many thanks for the reply again,
>
> Chris
>
>> On 6 Jun 2017, at 17:50, Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:
>>
>> Hi Chris,
>>
>> Can you share the following info:
>>
>> 1. The exact repair commands you use for incremental repair and pr repair.
>>
>> 2. Repair time should be measured at the cluster level for incremental repair. So, what is the total time it takes to run repair on all nodes for incremental vs pr repairs?
>>
>> 3. You are repairing one DC, DC3. How many DCs are there in total, and what is the RF for the keyspaces? Running pr on a specific DC would not repair the entire data.
>>
>> 4. 885 ranges? Where did you get this number? The logs? Can you share the number of ranges printed in the logs for both the incremental and the pr case?
>>
>> Thanks
>> Anuj
>>
>> Sent from Yahoo Mail on Android <https://overview.mail.yahoo.com/mobile/?.src=Android>
>>
>> On Tue, Jun 6, 2017 at 9:33 PM, Chris Stokesmore <chris.elsm...@demandlogic.co> wrote:
>> Thank you for the excellent and clear description of the different versions of repair, Anuj; that has cleared up what I expect to be happening.
>>
>> The problem now is that in our cluster we are running repairs with options (parallelism: parallel, primary range: false, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: [DC3], hosts: [], # of ranges: 885), and when we do, our repairs are taking over a day to complete, whereas previously, running with the partitioner range option, they were taking more like 8-9 hours.
>>
>> As I understand it, using incremental repair should have sped this process up, as all three replicas of the data should be marked as repaired by each repair job; however, this does not seem to be the case. Any ideas?
>>
>> Chris
>>
>>> On 6 Jun 2017, at 16:08, Anuj Wadehra <anujw_2...@yahoo.co.in.INVALID> wrote:
>>>
>>> Hi Chris,
>>>
>>> Using pr with incremental repairs does not make sense. Primary range repair is an optimization over full repair. If you run full repair on an n-node cluster with RF=3, you would be repairing each piece of data three times.
>>> E.g. in a 5 node cluster with RF=3, a range may exist on nodes A, B and C. When full repair is run on node A, the entire data in that range gets synced with the replicas on nodes B and C. Now, when you run full repair on nodes B and C, you are wasting resources repairing data which is already repaired.
>>>
>>> Primary range repair ensures that when you run repair on a node, it ONLY repairs the data which is owned by that node. Thus, no node repairs data which is not owned by it and must be repaired by another node. Redundant work is eliminated.
>>>
>>> Even with pr, each time you run pr on all nodes you repair 100% of the data. Why repair the complete data in each cycle, even data which has not changed since the last repair cycle?
>>>
>>> This is where incremental repair comes in as an improvement. Once repaired, data is marked as repaired so that the next repair cycle can focus only on repairing the delta. Now, let's go back to the example of the 5 node cluster with RF=3, this time running incremental repair on all nodes. When you repair the entire data on node A, all 3 replicas are marked as repaired. Even if you run incremental repair on all ranges on the second node, you would not re-repair the already repaired data. Thus, there is no advantage in repairing only the data owned by the node (the primary range of the node). You can run incremental repair on all the data present on a node, and Cassandra will make sure that when you repair data on other nodes, you only repair unrepaired data.
>>>
>>> Thanks
>>> Anuj
>>>
>>> Sent from Yahoo Mail on Android <https://overview.mail.yahoo.com/mobile/?.src=Android>
>>>
>>> On Tue, Jun 6, 2017 at 4:27 PM, Chris Stokesmore <chris.elsm...@demandlogic.co> wrote:
>>> Hi all,
>>>
>>> Wondering if anyone had any thoughts on this? At the moment the long-running repairs cause us to be running them on two nodes at once for a period of time, which obviously increases the cluster load.
>>>
>>> On 2017-05-25 16:18 (+0100), Chris Stokesmore <c...@demandlogic.co> wrote:
>>> > Hi,
>>> >
>>> > We are running a 7 node Cassandra 2.2.8 cluster, RF=3, and had been running repairs with the -pr option, via a cron job that runs on each node once per week.
>>> >
>>> > We changed that because some advice on the Cassandra IRC channel said it would cause more anticompaction, and http://docs.datastax.com/en/archived/cassandra/2.2/cassandra/tools/toolsRepair.html says 'Performing partitioner range repairs by using the -pr option is generally considered a good choice for doing manual repairs. However, this option cannot be used with incremental repairs (default for Cassandra 2.2 and later)'.
>>> >
>>> > The only problem is that our -pr repairs were taking about 8 hours, and now the non-pr repairs are taking 24+ - I guess this makes sense, since repairing 1/7 of the data has increased to 3/7, except I was hoping to see a speed-up after the first loop through the cluster, as each repair will be marking much more data as repaired, right?
>>> >
>>> > Is running -pr with incremental repairs really that bad?