It looks to me like that can indeed happen in theory (I might be wrong). However,

- Hinted Handoff tends to remove this issue. If this is a big worry, you might
  want to make sure HH is enabled and well tuned.
- Read Repair (synchronous or not) might also mitigate things, if you read
  fresh data. You can set the read repair chance to a higher value.
- After an outage, you should always run a nodetool repair on the node that
  went down, either because you follow best practices or because you
  understand the reasons. Or just trust HH, if that is enough for you.

So I would say that you can always "shoot yourself in the foot", whatever you
do, yet following best practices or understanding the internals is the key
imho.

I would say it is a good question though.

Alain.

2015-06-24 19:43 GMT+02:00 Anuj Wadehra <anujw_2...@yahoo.co.in>:

> Hi,
>
> We faced a scenario where we lost a little data after adding 2 nodes to the
> cluster. There were intermittent dropped mutations in the cluster. I need to
> verify my understanding of how this may have happened, for Root Cause
> Analysis:
>
> Scenario: 3 nodes, RF=3, Read / Write CL=Quorum
>
> 1. Due to the overloaded cluster, some writes only happened on 2 nodes: node 1
> & node 2, while the asynchronous mutations were dropped on node 3.
> So say key K with token T was not written to node 3.
>
> 2. I added node 4, and suppose that as per the newly calculated ranges, token T
> is now supposed to have replicas on node 1, node 3, and node 4. Unfortunately
> node 4 started bootstrapping from node 3, where key K was missing.
>
> 3. After the recommended 2 min gap, I added node 5, and as per the new token
> distribution suppose token T is now supposed to have replicas on node 3,
> node 4 and node 5. Again node 5 bootstrapped from node 3, where the data was
> missing.
>
> So now key K is lost, and that's how we lost a few rows.
>
> Moreover, in step 1 the situation could be worse. We can also have a scenario
> where some writes only happened on one of the three replicas and Cassandra
> chooses replicas where this data is missing for streaming ranges to the 2 new
> nodes.
>
> Am I making sense?
>
> We are using C* 2.0.3.
>
> Thanks
> Anuj
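
For what it's worth, the three steps in the quoted scenario can be walked through with a toy model. This is only a sketch, not Cassandra code: each node is a plain set of keys, and the `bootstrap` and `holders` helpers are hypothetical names made up for illustration. It assumes, as in the scenario, that each new node streams token T's range from node 3 only, and that no hint, read repair, or nodetool repair ran in between.

```python
RF = 3
QUORUM = RF // 2 + 1  # 2 when RF=3


def holders(stores, replicas, key):
    """Count how many of the current replicas actually hold the key."""
    return sum(1 for node in replicas if key in stores[node])


def quorum_read_guaranteed(stores, replicas, key):
    """True if every possible quorum of replicas includes a node holding the key."""
    missing = RF - holders(stores, replicas, key)
    return missing < QUORUM


def bootstrap(stores, new_node, source):
    """Assumption: the new node copies the range from a single source replica."""
    stores[new_node] = set(stores[source])


def simulate():
    # Step 1: the write of K reached nodes 1 and 2; the mutation was dropped on node 3.
    stores = {1: {"K"}, 2: {"K"}, 3: set()}
    history = [holders(stores, [1, 2, 3], "K")]

    # Step 2: node 4 bootstraps from node 3; replicas for T become 1, 3, 4.
    bootstrap(stores, 4, source=3)
    history.append(holders(stores, [1, 3, 4], "K"))

    # Step 3: node 5 bootstraps from node 3; replicas for T become 3, 4, 5.
    bootstrap(stores, 5, source=3)
    history.append(holders(stores, [3, 4, 5], "K"))
    return history


if __name__ == "__main__":
    print(simulate())
```

In this model the number of replicas holding K goes from 2 to 1 to 0, so a quorum read stops being guaranteed to see K after the first bootstrap, and no current replica has it after the second. Running a repair on node 3 before adding nodes would correspond to starting with K present in all three sets.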