Yep, Jeff is right, the intention would be to run a repair limited to the available nodes.
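In practice that looks something like the following (a sketch only: 10.0.0.1 and 10.0.0.2 are placeholders for the two surviving replicas, keyspace1/table1 is the table from your example, and the exact option syntax can vary by version, so check `nodetool help repair`):

    nodetool repair -full -hosts 10.0.0.1 -hosts 10.0.0.2 keyspace1 table1

Run it from one of the live replicas; the down node is simply left out of the -hosts list, so no merkle tree is requested from it.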
On Wed, May 27, 2020 at 2:59 PM Jeff Jirsa <jji...@gmail.com> wrote:

> The "-hosts" flag tells Cassandra to only compare trees/run repair on the
> hosts you specify, so if you have 3 replicas but 1 replica is down, you
> can provide -hosts with the other two, and it will make sure those two are
> in sync (via merkle trees, etc.) but ignore the third.
>
> On Wed, May 27, 2020 at 10:45 AM Nitan Kainth <nitankai...@gmail.com> wrote:
>
>> Jeff,
>>
>> If Cassandra is down, how will it generate a merkle tree to compare?
>>
>> Regards,
>>
>> Nitan
>>
>> Cell: 510 449 9629
>>
>> On May 27, 2020, at 11:15 AM, Jeff Jirsa <jji...@gmail.com> wrote:
>>
>> You definitely can repair with a node down by passing `-hosts specific_hosts`.
>>
>> On Wed, May 27, 2020 at 9:06 AM Nitan Kainth <nitankai...@gmail.com> wrote:
>>
>>> I didn't get you, Leon.
>>>
>>> But the simple thing is just to follow the steps and you will be fine.
>>> You can't run the repair if the node is down.
>>>
>>> On Wed, May 27, 2020 at 10:34 AM Leon Zaruvinsky <leonzaruvin...@gmail.com> wrote:
>>>
>>>> Hey Jeff/Nitan,
>>>>
>>>> 1) This concern should not be a problem if the repair happens before
>>>> the corrupted node is brought back online, right?
>>>> 2) In this case, is option (3) equivalent to replacing the node, where
>>>> we repair the two live nodes and then bring up the third node with no data?
>>>>
>>>> Leon
>>>>
>>>> On Tue, May 26, 2020 at 10:11 PM Jeff Jirsa <jji...@gmail.com> wrote:
>>>>
>>>>> There are two problems with this approach if you need strict correctness:
>>>>>
>>>>> 1) After you delete the sstable and before you repair, you'll violate
>>>>> consistency, so you'll potentially serve incorrect data for a while.
>>>>>
>>>>> 2) The sstable may have a tombstone past gc grace that's shadowing
>>>>> data in another sstable that's not corrupt, and deleting it may resurrect
>>>>> that deleted data.
>>>>>
>>>>> The only strictly safe thing to do here, unfortunately, is to treat
>>>>> the host as failed and rebuild it from its neighbors (and, being
>>>>> pedantic here, that means stop the host, repair the surviving replicas
>>>>> while it's stopped, then bootstrap a replacement on top of the same
>>>>> tokens).
>>>>>
>>>>> > On May 26, 2020, at 4:46 PM, Leon Zaruvinsky <leonzaruvin...@gmail.com> wrote:
>>>>> >
>>>>> > Hi all,
>>>>> >
>>>>> > I'm looking to understand Cassandra's behavior in an sstable corruption
>>>>> > scenario, and what the minimum amount of work is that needs to be done
>>>>> > to remove a bad sstable file.
>>>>> >
>>>>> > Consider: a 3-node, RF 3 cluster, reads/writes at quorum.
>>>>> > SSTable corruption exception on one node at keyspace1/table1/lb-1-big-Data.db.
>>>>> > Sstablescrub does not work.
>>>>> >
>>>>> > Is it safest to, after running a repair on the two live nodes,
>>>>> > 1) Delete only keyspace1/table1/lb-1-big-Data.db,
>>>>> > 2) Delete all files associated with that sstable (i.e., keyspace1/table1/lb-1-*),
>>>>> > 3) Delete all files under keyspace1/table1/, or
>>>>> > 4) Any of the above are the same from a correctness perspective?
>>>>> >
>>>>> > Thanks,
>>>>> > Leon
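For completeness, the rebuild path Jeff describes upthread usually comes down to something like this (again only a sketch: 10.0.0.3 stands in for the corrupted node's own address, the other IPs are the live replicas, and the exact replace flag and startup file depend on your Cassandra version, so check the node-replacement docs first):

    # 1. Stop Cassandra on the corrupted node and leave it down.
    # 2. While it is down, repair the surviving replicas from one of the live nodes:
    nodetool repair -full -hosts 10.0.0.1 -hosts 10.0.0.2 keyspace1 table1
    # 3. Clear the data directories on the bad host (or provision a fresh one), then start it
    #    with the replace flag (e.g. appended in cassandra-env.sh) so it bootstraps the same
    #    tokens from its neighbors; older versions use -Dcassandra.replace_address instead:
    JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address_first_boot=10.0.0.3"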