On 2025/03/01 00:27:27 Jaydeep Chovatia wrote:
> Hi,
> 
> I want to reattach an asynchronously replicated EBS volume to Cassandra. I
> want to know how to fix the delta inconsistency when reattaching other than
> running a repair on the dataset.
> 
> Here is the scenario.
> Three Cassandra nodes in three separate zones:
> Node1 --> Zone1 (*EBS_Drive1*, Async_EBS_Drive3_Replica)
> Node2 --> Zone2 (EBS_Drive2, *Async_EBS_Drive1_Replica*)
> Node3 --> Zone3 (EBS_Drive3, Async_EBS_Drive2_Replica)
> 
> EBS replicates data between Zones asynchronously. EBS Drive1 in Zone1 is
> asynchronously copied to EBS Drive1 in Zone2, and so on.
> 
> If Node1 goes down in Zone1, I want to reattach Node1's asynchronously
> replicated drive, *Async_EBS_Drive1_Replica,* in Zone2, which is fine. But
> this async drive would be missing some of the latest data, say the last 15
> minutes, which was present in EBS_Drive1. Besides going through Cassandra
> repair, what are my options to repair the missing data when I reattach
> *Async_EBS_Drive1_Replica*?

There is no way to do this with just Cassandra in any version of Cassandra 
available in 2025.

You're effectively asking for point-in-time restore functionality from a 
backup system that doesn't implement point-in-time restore. You'd need the 
delta commitlogs and newly flushed sstables, and you'd have to replay them. 
Hints won't work, because you've acked the mutations. Incremental repair (IR) 
isn't guaranteed to work, because all replicas may have already promoted the 
unrepaired set on the sync EBS volume.
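
To make "the delta" concrete, here is a minimal sketch of how one might list 
the commitlog segments and sstable files written after the replica's last 
known-good point. It assumes the original volume (or some copy of it) is still 
readable, which may well not be true in the failure scenario above, and it 
uses file modification time as a stand-in for "written after the cutoff". The 
function name and the cutoff parameter are hypothetical; nothing here is a 
Cassandra API.

```python
from pathlib import Path


def files_modified_since(root, cutoff_epoch_seconds):
    """Return sorted relative paths of files under `root` whose mtime is
    at or after `cutoff_epoch_seconds`.

    Assumption: mtime is a usable proxy for "changed after the replica's
    last consistent point". SSTable files are immutable once written, so
    a new file means new data; commitlog segments are appended in place,
    so any segment touched after the cutoff is treated as part of the delta.
    """
    root = Path(root)
    delta = []
    for path in root.rglob("*"):
        if path.is_file() and path.stat().st_mtime >= cutoff_epoch_seconds:
            # as_posix() keeps the output stable across platforms
            delta.append(path.relative_to(root).as_posix())
    return sorted(delta)
```

Even with such a list in hand, the files would still have to be copied over 
and replayed or loaded by some out-of-band process; this only identifies 
candidates, it does not make the replica consistent.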

(If the nodes are ALSO running their own Cassandra processes from the primary 
zone, I suspect you also end up with out-of-range data, which is not great, and 
I have no idea how future enhancements like CEP-45 would think about this, or 
about point-in-time restore in general.)

