Yeah, I meant the down node can’t participate in repairs.

Regards,
Nitan
Cell: 510 449 9629
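For concreteness, the "repair only the live replicas" step discussed in the
thread below might look like this, run from one of the surviving nodes; the
IP addresses and keyspace name are placeholders, and exact flag spelling can
vary slightly across Cassandra versions:

    nodetool repair -full -hosts 10.0.0.1 -hosts 10.0.0.2 keyspace1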
> On May 27, 2020, at 2:09 PM, Leon Zaruvinsky <leonzaruvin...@gmail.com> wrote:
>
> Yep, Jeff is right, the intention would be to run a repair limited to the
> available nodes.
>
>> On Wed, May 27, 2020 at 2:59 PM Jeff Jirsa <jji...@gmail.com> wrote:
>> The "-hosts" flag tells Cassandra to only compare trees/run repair on the
>> hosts you specify, so if you have 3 replicas but 1 replica is down, you
>> can provide -hosts with the other two, and it will make sure those two
>> are in sync (via merkle trees, etc.) but ignore the third.
>>
>>> On Wed, May 27, 2020 at 10:45 AM Nitan Kainth <nitankai...@gmail.com> wrote:
>>> Jeff,
>>>
>>> If Cassandra is down, how will it generate a merkle tree to compare?
>>>
>>> Regards,
>>> Nitan
>>> Cell: 510 449 9629
>>>
>>>> On May 27, 2020, at 11:15 AM, Jeff Jirsa <jji...@gmail.com> wrote:
>>>>
>>>> You definitely can repair with a node down by passing `-hosts
>>>> specific_hosts`.
>>>>
>>>>> On Wed, May 27, 2020 at 9:06 AM Nitan Kainth <nitankai...@gmail.com> wrote:
>>>>> I didn't get you, Leon.
>>>>>
>>>>> But the simple thing is just to follow the steps and you will be fine.
>>>>> You can't run the repair if the node is down.
>>>>>
>>>>>> On Wed, May 27, 2020 at 10:34 AM Leon Zaruvinsky
>>>>>> <leonzaruvin...@gmail.com> wrote:
>>>>>> Hey Jeff/Nitan,
>>>>>>
>>>>>> 1) This concern should not be a problem if the repair happens before
>>>>>> the corrupted node is brought back online, right?
>>>>>> 2) In this case, is option (3) equivalent to replacing the node, where
>>>>>> we repair the two live nodes and then bring up the third node with no
>>>>>> data?
>>>>>>
>>>>>> Leon
>>>>>>
>>>>>>> On Tue, May 26, 2020 at 10:11 PM Jeff Jirsa <jji...@gmail.com> wrote:
>>>>>>> There are two problems with this approach if you need strict
>>>>>>> correctness:
>>>>>>>
>>>>>>> 1) After you delete the sstable and before you repair, you'll violate
>>>>>>> consistency, so you'll potentially serve incorrect data for a while.
>>>>>>>
>>>>>>> 2) The sstable may have a tombstone past gc_grace that's shadowing
>>>>>>> data in another sstable that's not corrupt, and deleting it may
>>>>>>> resurrect that deleted data.
>>>>>>>
>>>>>>> The only strictly safe thing to do here, unfortunately, is to treat
>>>>>>> the host as failed and rebuild it from its neighbors (and again,
>>>>>>> being pedantic here, that means stop the host, while it's stopped
>>>>>>> repair the surviving replicas, then bootstrap a replacement on top of
>>>>>>> the same tokens).
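A minimal sketch of the rebuild Jeff describes above, assuming the surviving
replicas have already been repaired with -hosts as shown earlier; the IP is a
placeholder, and depending on version the flag is cassandra.replace_address
or cassandra.replace_address_first_boot:

    # On the replacement node, with empty data/commitlog directories, add to
    # cassandra-env.sh so it bootstraps onto the dead node's tokens:
    JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=10.0.0.3"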
>>>>>>>> On May 26, 2020, at 4:46 PM, Leon Zaruvinsky
>>>>>>>> <leonzaruvin...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I'm looking to understand Cassandra's behavior in an sstable
>>>>>>>> corruption scenario, and the minimum amount of work needed to remove
>>>>>>>> a bad sstable file.
>>>>>>>>
>>>>>>>> Consider: a 3-node, RF 3 cluster, reads/writes at quorum.
>>>>>>>> An SSTable corruption exception on one node at
>>>>>>>> keyspace1/table1/lb-1-big-Data.db, and sstablescrub does not work.
>>>>>>>>
>>>>>>>> Is it safest to, after running a repair on the two live nodes:
>>>>>>>> 1) Delete only keyspace1/table1/lb-1-big-Data.db,
>>>>>>>> 2) Delete all files associated with that sstable (i.e.,
>>>>>>>> keyspace1/table1/lb-1-*),
>>>>>>>> 3) Delete all files under keyspace1/table1/, or
>>>>>>>> 4) Any of the above, since they are all the same from a correctness
>>>>>>>> perspective?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Leon
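For reference on options (1) and (2): every component of a single sstable
shares the lb-1-big- prefix, so deleting only the Data.db file strands its
companion files. A listing for the 2.2-era "lb" format would look roughly
like this (exact component names vary by Cassandra version):

    $ ls keyspace1/table1/lb-1-big-*
    lb-1-big-CompressionInfo.db   lb-1-big-Data.db     lb-1-big-Digest.adler32
    lb-1-big-Filter.db            lb-1-big-Index.db    lb-1-big-Statistics.db
    lb-1-big-Summary.db           lb-1-big-TOC.txt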