Yeah, I meant the down node can’t participate in repairs.

Regards,
Nitan
Cell: 510 449 9629
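For concreteness, the "repair only the live replicas" step discussed in the
thread below might look like this, run from one of the surviving nodes; the
IP addresses and keyspace name are placeholders, and exact flag spelling can
vary slightly across Cassandra versions:

    nodetool repair -full -hosts 10.0.0.1 -hosts 10.0.0.2 keyspace1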
> On May 27, 2020, at 2:09 PM, Leon Zaruvinsky <leonzaruvin...@gmail.com> wrote:
>
> Yep, Jeff is right, the intention would be to run a repair limited to the
> available nodes.
>
>> On Wed, May 27, 2020 at 2:59 PM Jeff Jirsa <jji...@gmail.com> wrote:
>> The "-hosts" flag tells Cassandra to only compare trees/run repair on the
>> hosts you specify, so if you have 3 replicas but 1 replica is down, you
>> can provide -hosts with the other two, and it will make sure those two
>> are in sync (via merkle trees, etc.) but ignore the third.
>>
>>> On Wed, May 27, 2020 at 10:45 AM Nitan Kainth <nitankai...@gmail.com> wrote:
>>> Jeff,
>>>
>>> If Cassandra is down, how will it generate a merkle tree to compare?
>>>
>>> Regards,
>>> Nitan
>>> Cell: 510 449 9629
>>>
>>>> On May 27, 2020, at 11:15 AM, Jeff Jirsa <jji...@gmail.com> wrote:
>>>>
>>>> You definitely can repair with a node down by passing `-hosts
>>>> specific_hosts`.
>>>>
>>>>> On Wed, May 27, 2020 at 9:06 AM Nitan Kainth <nitankai...@gmail.com> wrote:
>>>>> I didn't get you, Leon.
>>>>>
>>>>> But the simple thing is just to follow the steps and you will be fine.
>>>>> You can't run the repair if the node is down.
>>>>>
>>>>>> On Wed, May 27, 2020 at 10:34 AM Leon Zaruvinsky
>>>>>> <leonzaruvin...@gmail.com> wrote:
>>>>>> Hey Jeff/Nitan,
>>>>>>
>>>>>> 1) This concern should not be a problem if the repair happens before
>>>>>> the corrupted node is brought back online, right?
>>>>>> 2) In this case, is option (3) equivalent to replacing the node, where
>>>>>> we repair the two live nodes and then bring up the third node with no
>>>>>> data?
>>>>>>
>>>>>> Leon
>>>>>>
>>>>>>> On Tue, May 26, 2020 at 10:11 PM Jeff Jirsa <jji...@gmail.com> wrote:
>>>>>>> There are two problems with this approach if you need strict
>>>>>>> correctness:
>>>>>>>
>>>>>>> 1) After you delete the sstable and before you repair, you'll violate
>>>>>>> consistency, so you'll potentially serve incorrect data for a while.
>>>>>>>
>>>>>>> 2) The sstable may have a tombstone past gc_grace that's shadowing
>>>>>>> data in another sstable that's not corrupt, and deleting it may
>>>>>>> resurrect that deleted data.
>>>>>>>
>>>>>>> The only strictly safe thing to do here, unfortunately, is to treat
>>>>>>> the host as failed and rebuild it from its neighbors (and again,
>>>>>>> being pedantic here, that means stop the host, while it's stopped
>>>>>>> repair the surviving replicas, then bootstrap a replacement on top of
>>>>>>> the same tokens).
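A minimal sketch of the rebuild Jeff describes above, assuming the surviving
replicas have already been repaired with -hosts as shown earlier; the IP is a
placeholder, and depending on version the flag is cassandra.replace_address
or cassandra.replace_address_first_boot:

    # On the replacement node, with empty data/commitlog directories, add to
    # cassandra-env.sh so it bootstraps onto the dead node's tokens:
    JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=10.0.0.3"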
>>>>>>>> On May 26, 2020, at 4:46 PM, Leon Zaruvinsky
>>>>>>>> <leonzaruvin...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I'm looking to understand Cassandra's behavior in an sstable
>>>>>>>> corruption scenario, and the minimum amount of work needed to remove
>>>>>>>> a bad sstable file.
>>>>>>>>
>>>>>>>> Consider: a 3-node, RF 3 cluster, reads/writes at quorum.
>>>>>>>> An SSTable corruption exception on one node at
>>>>>>>> keyspace1/table1/lb-1-big-Data.db, and sstablescrub does not work.
>>>>>>>>
>>>>>>>> Is it safest to, after running a repair on the two live nodes:
>>>>>>>> 1) Delete only keyspace1/table1/lb-1-big-Data.db,
>>>>>>>> 2) Delete all files associated with that sstable (i.e.,
>>>>>>>> keyspace1/table1/lb-1-*),
>>>>>>>> 3) Delete all files under keyspace1/table1/, or
>>>>>>>> 4) Any of the above, since they are all the same from a correctness
>>>>>>>> perspective?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Leon
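For reference on options (1) and (2): every component of a single sstable
shares the lb-1-big- prefix, so deleting only the Data.db file strands its
companion files. A listing for the 2.2-era "lb" format would look roughly
like this (exact component names vary by Cassandra version):

    $ ls keyspace1/table1/lb-1-big-*
    lb-1-big-CompressionInfo.db   lb-1-big-Data.db     lb-1-big-Digest.adler32
    lb-1-big-Filter.db            lb-1-big-Index.db    lb-1-big-Statistics.db
    lb-1-big-Summary.db           lb-1-big-TOC.txt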