Yep, Jeff is right, the intention would be to run a repair limited to the available nodes.
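In practice that looks something like the following (a sketch only: 10.0.0.1 and 10.0.0.2 are placeholders for the two surviving replicas, keyspace1/table1 is the table from your example, and the exact option syntax can vary by version, so check `nodetool help repair`):

    nodetool repair -full -hosts 10.0.0.1 -hosts 10.0.0.2 keyspace1 table1

Run it from one of the live replicas; the down node is simply left out of the -hosts list, so no merkle tree is requested from it.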
On Wed, May 27, 2020 at 2:59 PM Jeff Jirsa <jji...@gmail.com> wrote:

> The "-hosts" flag tells Cassandra to only compare trees/run repair on the
> hosts you specify, so if you have 3 replicas but 1 replica is down, you
> can provide -hosts with the other two, and it will make sure those two are
> in sync (via merkle trees, etc.) but ignore the third.
>
> On Wed, May 27, 2020 at 10:45 AM Nitan Kainth <nitankai...@gmail.com> wrote:
>
>> Jeff,
>>
>> If Cassandra is down, how will it generate a merkle tree to compare?
>>
>> Regards,
>>
>> Nitan
>>
>> Cell: 510 449 9629
>>
>> On May 27, 2020, at 11:15 AM, Jeff Jirsa <jji...@gmail.com> wrote:
>>
>> You definitely can repair with a node down by passing `-hosts specific_hosts`.
>>
>> On Wed, May 27, 2020 at 9:06 AM Nitan Kainth <nitankai...@gmail.com> wrote:
>>
>>> I didn't get you, Leon.
>>>
>>> But the simple thing is just to follow the steps and you will be fine.
>>> You can't run the repair if the node is down.
>>>
>>> On Wed, May 27, 2020 at 10:34 AM Leon Zaruvinsky <leonzaruvin...@gmail.com> wrote:
>>>
>>>> Hey Jeff/Nitan,
>>>>
>>>> 1) This concern should not be a problem if the repair happens before
>>>> the corrupted node is brought back online, right?
>>>> 2) In this case, is option (3) equivalent to replacing the node, where
>>>> we repair the two live nodes and then bring up the third node with no data?
>>>>
>>>> Leon
>>>>
>>>> On Tue, May 26, 2020 at 10:11 PM Jeff Jirsa <jji...@gmail.com> wrote:
>>>>
>>>>> There are two problems with this approach if you need strict correctness:
>>>>>
>>>>> 1) After you delete the sstable and before you repair, you'll violate
>>>>> consistency, so you'll potentially serve incorrect data for a while.
>>>>>
>>>>> 2) The sstable may have a tombstone past gc grace that's shadowing
>>>>> data in another sstable that's not corrupt, and deleting it may resurrect
>>>>> that deleted data.
>>>>>
>>>>> The only strictly safe thing to do here, unfortunately, is to treat
>>>>> the host as failed and rebuild it from its neighbors (and, being
>>>>> pedantic here, that means stop the host, repair the surviving replicas
>>>>> while it's stopped, then bootstrap a replacement on top of the same
>>>>> tokens).
>>>>>
>>>>> > On May 26, 2020, at 4:46 PM, Leon Zaruvinsky <leonzaruvin...@gmail.com> wrote:
>>>>> >
>>>>> > Hi all,
>>>>> >
>>>>> > I'm looking to understand Cassandra's behavior in an sstable corruption
>>>>> > scenario, and what the minimum amount of work is that needs to be done
>>>>> > to remove a bad sstable file.
>>>>> >
>>>>> > Consider: a 3-node, RF 3 cluster, reads/writes at quorum.
>>>>> > SSTable corruption exception on one node at keyspace1/table1/lb-1-big-Data.db.
>>>>> > Sstablescrub does not work.
>>>>> >
>>>>> > Is it safest to, after running a repair on the two live nodes,
>>>>> > 1) Delete only keyspace1/table1/lb-1-big-Data.db,
>>>>> > 2) Delete all files associated with that sstable (i.e., keyspace1/table1/lb-1-*),
>>>>> > 3) Delete all files under keyspace1/table1/, or
>>>>> > 4) Any of the above are the same from a correctness perspective?
>>>>> >
>>>>> > Thanks,
>>>>> > Leon
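For completeness, the rebuild path Jeff describes upthread usually comes down to something like this (again only a sketch: 10.0.0.3 stands in for the corrupted node's own address, the other IPs are the live replicas, and the exact replace flag and startup file depend on your Cassandra version, so check the node-replacement docs first):

    # 1. Stop Cassandra on the corrupted node and leave it down.
    # 2. While it is down, repair the surviving replicas from one of the live nodes:
    nodetool repair -full -hosts 10.0.0.1 -hosts 10.0.0.2 keyspace1 table1
    # 3. Clear the data directories on the bad host (or provision a fresh one), then start it
    #    with the replace flag (e.g. appended in cassandra-env.sh) so it bootstraps the same
    #    tokens from its neighbors; older versions use -Dcassandra.replace_address instead:
    JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address_first_boot=10.0.0.3"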