Atul,

our fork has been tested on 2.1 and 3.0.x clusters. I've just tested it against a CCM 3.6 cluster and it worked without issue.
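If you want to reproduce the test on your side, a throwaway cluster can be spun up with ccm roughly like this (the cluster name and node count are just examples):

    # create and start a local 3-node Cassandra 3.6 cluster
    ccm create reaper-test -v 3.6 -n 3 -s
    # check that all nodes are up before pointing Reaper at it
    ccm status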
With Reaper, if you set incremental to false, it'll perform a full subrange repair with no anticompaction. You'll see this message in the logs:

INFO [AntiEntropyStage:1] 2016-09-29 16:11:34,950 ActiveRepairService.java:378 - Not a global repair, will not do anticompaction

If you set incremental to true, it'll perform an incremental repair, one node at a time, with anticompaction (set Parallelism to Parallel exclusively with incremental repair).
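For example, once a segment has run with incremental set to false, you can confirm on the node that no anticompaction was triggered, and compare with a manual full subrange repair (the tokens and keyspace are placeholders, and the log path assumes a package install):

    # the "not a global repair" message means no anticompaction took place
    grep "will not do anticompaction" /var/log/cassandra/system.log
    # equivalent manual full subrange repair on a single token range
    nodetool repair -full -st <start_token> -et <end_token> <keyspace>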
Let me know how it goes.

On Thu, Sep 29, 2016 at 3:06 PM Atul Saroha <atul.sar...@snapdeal.com> wrote:

> Hi Alexander,
>
> There is a compatibility issue raised with spotify/cassandra-reaper for Cassandra version 3.x. Is it compatible with 3.6 in the thelastpickle/cassandra-reaper fork?
>
> There are some suggestions mentioned by *brstgt* which we can try on our side.
>
> On Thu, Sep 29, 2016 at 5:42 PM, Atul Saroha <atul.sar...@snapdeal.com> wrote:
>
>> Thanks Alexander.
>>
>> Will look into all these.
>>
>> On Thu, Sep 29, 2016 at 4:39 PM, Alexander Dejanovski <a...@thelastpickle.com> wrote:
>>
>>> Atul,
>>>
>>> Since you're using 3.6, by default you're running incremental repair, which doesn't like concurrency very much.
>>> Validation errors don't occur on a per-partition or per-partition-range basis; they occur when you try to run both an anticompaction and a validation compaction on the same SSTable.
>>>
>>> As advised to Robert yesterday, if you want to keep on running incremental repair, I'd suggest the following:
>>>
>>> - run nodetool tpstats on all nodes in search of running/pending repair sessions
>>> - if you have some, and to be sure you avoid conflicts, perform a rolling restart of your cluster (all nodes)
>>> - then, run "nodetool repair" on one node
>>> - when repair has finished on this node (track messages in the log and nodetool tpstats), check whether other nodes are running anticompactions
>>> - if so, wait until they are over
>>> - if not, move on to the next node
>>>
>>> You should be able to run concurrent incremental repairs on different tables if you wish to speed up the complete repair of the cluster, but do not try to repair the same table/full keyspace from two nodes at the same time.
>>>
>>> If you do not want to keep using incremental repair and prefer to fall back to classic full repair, I think the only way in 3.6 to avoid anticompaction is to use subrange repair (Paulo mentioned that in 3.x full repair also triggers anticompaction).
>>>
>>> You have two options here: cassandra_range_repair (https://github.com/BrianGallew/cassandra_range_repair) and Spotify Reaper (https://github.com/spotify/cassandra-reaper).
>>>
>>> cassandra_range_repair might scream about subrange + incremental not being compatible (not sure here), but you can modify the repair_range() method by adding a --full switch to the command line used to run repair.
>>>
>>> We have a fork of Reaper that handles both full subrange repair and incremental repair here: https://github.com/thelastpickle/cassandra-reaper
>>> It comes with a tweaked version of the UI made by Stephan Podkowinski (https://github.com/spodkowinski/cassandra-reaper-ui), which eases scheduling, running and tracking repairs, and adds fields to run incremental repair (accessible via ...:8080/webui/ in your browser).
>>>
>>> Cheers,
>>>
>>> On Thu, Sep 29, 2016 at 12:33 PM Atul Saroha <atul.sar...@snapdeal.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> We are not sure whether this issue is linked to that node or not. Our application does frequent deletes and inserts.
>>>>
>>>> Maybe our approach to nodetool repair is not correct. Yes, we generally fire repair on all boxes at the same time. Till now, it was manual with the default configuration (command: "nodetool repair").
>>>> Yes, we saw validation errors, but those are linked to a repair of the same partition range already running on another box: validation failed with some IP because a repair was already running for the same SSTable.
>>>> Just a few days back, we had 2 DCs with 3 nodes each and a replication factor of 3, which means all data is on each node.
>>>>
>>>> On Thu, Sep 29, 2016 at 2:49 PM, Alexander Dejanovski <a...@thelastpickle.com> wrote:
>>>>
>>>>> Hi Atul,
>>>>>
>>>>> Could you be more specific on how you are running repair? What's the precise command line for that, does it run on several nodes at the same time, etc.?
>>>>> What is your gc_grace_seconds?
>>>>> Do you see errors in your logs that would be linked to repairs (validation failure or failure to create a Merkle tree)?
>>>>>
>>>>> You seem to mention a single node that went down, but say the whole cluster seems to have zombie data.
>>>>> What is the connection you see between the node that went down and the fact that deleted data comes back to life?
>>>>> What is your strategy for regular maintenance repairs (schedule, command line or tool, etc.)?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> On Thu, Sep 29, 2016 at 10:40 AM Atul Saroha <atul.sar...@snapdeal.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> We have seen weird behaviour in Cassandra 3.6.
>>>>>> One of our nodes went down for more than 10 hrs. After that, we ran nodetool repair multiple times, but tombstones are not getting synced properly across the cluster. On a day-to-day basis, on expiry of every grace period, deleted records start surfacing again in Cassandra.
>>>>>>
>>>>>> It seems nodetool repair is not syncing tombstones across the cluster. FYI, we have 3 data centres now.
>>>>>>
>>>>>> We just want help on how to verify and debug this issue. Help will be appreciated.
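Regarding the resurfacing deletes in the original message above, it can also help to double-check the table's gc_grace_seconds and look at recent repair activity on each node, for example (keyspace/table names are placeholders, and the log path assumes a package install):

    # check gc_grace_seconds for the affected table
    cqlsh -e "SELECT gc_grace_seconds FROM system_schema.tables WHERE keyspace_name='<keyspace>' AND table_name='<table>';"
    # look at recent repair-related log lines on the node
    grep -i "repair" /var/log/cassandra/system.log | tail -n 30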
--
-----------------
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com