The risk is you violate consistency while you run repair. Assume you have three replicas for that range: A, B, C.

At some point B misses a write, but it's committed on A and C for quorum. Now C has a corrupt sstable. You empty C, bring it back with no data, and start repair. Then the app reads at quorum and selects B and C. You don't see the data when you do a quorum read - this is technically incorrect.

You could:

1. Stop the host with the corrupt sstable.
2. Run repair on the impacted token range using just the surviving hosts (this makes sure the two survivors have all of the data).
3. Clear all the data for that table on the host with the corrupt sstable (I think you can leave the commitlog in place, but you probably want to flush and drain before you stop the host).
4. Then bring that host up and run repair.

I think that's strictly safe, and you're just rebuilding 5-6 GB.
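If it helps, that maps roughly to the commands below. The keyspace, table, and IPs are placeholders for your own, and the host-restriction flag and exact syntax vary a little between versions, so double-check with nodetool help repair first:

  # On the node holding the corrupt sstable (C): flush, drain, then stop it
  # (use service/systemctl or however you normally stop Cassandra)
  nodetool flush my_keyspace my_table
  nodetool drain
  sudo service cassandra stop

  # On one of the surviving replicas (A or B): full repair of the table,
  # restricted to the two live replicas so the survivors converge first
  # (you can also narrow it to the affected token range with -st/-et)
  nodetool repair -full -hosts <ip_of_A> -hosts <ip_of_B> my_keyspace my_table

  # Back on C, while it is still down: clear that table's data files
  rm -rf /var/lib/cassandra/data/my_keyspace/my_table-<table_id>/*

  # Bring C up and repair so it streams the ~5-6 GB back from A and B
  sudo service cassandra start
  nodetool repair -full my_keyspace my_table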
> On Feb 13, 2020, at 11:23 PM, manish khandelwal <manishkhandelwa...@gmail.com> wrote:
>
> Thanks Jeff for your response.
>
> Do you see any risk in the following approach?
>
> 1. Stop the node.
> 2. Remove all sstable files from the /var/lib/cassandra/data/keyspace/tablename-23dfadf32a3333df33d33s333s33s3s33 directory.
> 3. Start the node.
> 4. Run full repair on this particular table.
>
> I wanted to go this way because this table is small (5-6 GB). I would like to avoid 2-3 days of streaming in case of replacing the whole host.
>
> Regards
> Manish
>
>> On Fri, Feb 14, 2020 at 12:28 PM Jeff Jirsa <jji...@gmail.com> wrote:
>> Agree this is both strictly possible and more common with LCS. The only thing that's strictly correct to do is treat every corrupt sstable exception as a failed host, and replace it just like you would a failed host.
>>
>>> On Thu, Feb 13, 2020 at 10:55 PM manish khandelwal <manishkhandelwa...@gmail.com> wrote:
>>> Thanks Erick
>>>
>>> I would like to explain how data resurrection can take place with a single SSTable deletion.
>>>
>>> Consider this case for a table with Levelled Compaction Strategy:
>>>
>>> 1. Data A is written a long time back.
>>> 2. Data A is deleted and a tombstone is created.
>>> 3. After GC grace the tombstone is purgeable.
>>> 4. Now the SSTable containing the purgeable tombstone on one node is corrupted.
>>> 5. The node with the corrupt SSTable cannot compact away the data and the purgeable tombstone.
>>> 6. On the other two nodes Data A is removed after compaction.
>>> 7. Remove the corrupt SSTable from the impacted node.
>>> 8. When you run repair, Data A is copied to all the nodes.
>>>
>>> The table in question is using Levelled Compaction Strategy.
>>>
>>> Regards
>>> Manish
>>>
>>>> On Fri, Feb 14, 2020 at 12:00 PM Erick Ramirez <erick.rami...@datastax.com> wrote:
>>>> The log shows that the problem occurs when decompressing the SSTable, but there's not much actionable info from it.
>>>>
>>>>> I would like to know what would be the "ordinary hammer" in this case. Do you want to suggest that deleting only the corrupt sstable file (in this case mc-1234-big-*.db) would suffice?
>>>>
>>>> Exactly. I mean if it's just a one-off, why go through the trouble of blowing away all the files? :)
>>>>
>>>>> I am afraid that this may cause data resurrection (I have prior experience with the same).
>>>>
>>>> Whoa! That's a long bow to draw. Sounds like there's more history to it.
>>>>
>>>>> Note that I am not willing to run the entire node rebuild as it will take lots of time due to the presence of multiple big tables (I am keeping it as my last option).
>>>>
>>>> I wasn't going to suggest that at all. I didn't like the sledgehammer approach.
>>>> I certainly wouldn't recommend bringing in a wrecking ball. 😁
>>>>
>>>> Cheers!