Hi Alisher,

First of all, I think this discussion should be moved to the user list (u...@ignite.apache.org).

If you have 2 backups (and enough nodes to host them), you can lose 2 nodes without fear of losing data. If more than 2 nodes fail, your data may be incomplete (only part of it is still available). In general, if you lose more nodes than you have backups, treat that as a signal that the cache may be inconsistent.

You can also split your nodes into groups, for example a group for backups and a group for primaries, using RendezvousAffinityFunction#setAffinityBackupFilter.

On Wed, Nov 2, 2016 at 11:11 AM, Alisher Alimov <alimovalis...@gmail.com> wrote:

> Hi!
>
> I have a question about data consistency in cluster, if there are any
> mechanism for checking that cache is in consistency state (no lost
> data/partitions)
>
> For example I have a cluster with N nodes, long compute job that calculate
> monthly revenue on huge amount of data. Data is transaction log that stored
> in cache “transactions" <UUID, Float> (where key is transaction id, value
> transaction amount). Cache backups = 2.
>
> First case.
>
> 1. We load huge amount of data in cache "transactions"
> 2. All is fine
> 3. Run simple compute job that sum values in transaction log ( Float::sum )
> 4. Compute job run on all nodes except n1,n2
> 5. We lost n1 node, than n2 node that stored backup of n1 node
> 6. Ignite found that node n1 and n2 are down and rebalancing data
> 7. Compute job was completed
>
> Second case.
>
> 1. We load huge amount of data in cache "transactions"
> 2. All is fine
> 3. We lost n1 node, then n2 node that stored backup of n1 node
> 4. Ignite found that node n1 and n2 are down and rebalancing data
> 5. Run simple compute job that sum values in transaction log ( (a,b) -> a+b )
>
> So questions:
> - what result we can retrieve in first case from compute job?
> - at second case we run compute job on cache that is not in consistency
> state (partitions that belongs to n1, n2 nodes was lost) but Ignite cache
> will work fine and allow us to run ComputeJob on cache and doesn’t tell us
> that some data was lost, isn’t it?
>
> With best regards
> Alisher Alimov
> alimovalis...@gmail.com
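
P.S. To illustrate the backup arithmetic from my reply, here is a minimal sketch in plain Java. It is not Ignite code: the class PartitionLossSketch, its methods, and the round-robin placement of copies are assumptions made only for this example, and Ignite's real affinity function distributes partitions differently. It only shows that with backups = 2 any two node failures leave at least one copy of every partition, while a third failure can make some partitions disappear.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Toy model of partition ownership. Hypothetical helper, NOT part of Ignite. */
final class PartitionLossSketch {

    /**
     * Nodes that hold a copy of the partition: one primary plus `backups` backups.
     * Copies are placed on consecutive node ids (an assumption for illustration only).
     */
    static List<Integer> owners(int partition, int nodes, int backups) {
        List<Integer> result = new ArrayList<>();
        int copies = Math.min(backups + 1, nodes);
        for (int i = 0; i < copies; i++) {
            result.add((partition + i) % nodes);
        }
        return result;
    }

    /** Partitions whose every copy lived on a failed node, i.e. whose data is gone. */
    static List<Integer> lostPartitions(int partitions, int nodes, int backups, Set<Integer> failed) {
        List<Integer> lost = new ArrayList<>();
        for (int p = 0; p < partitions; p++) {
            if (failed.containsAll(owners(p, nodes, backups))) {
                lost.add(p);
            }
        }
        return lost;
    }

    public static void main(String[] args) {
        // 6 nodes, 12 partitions, backups = 2 (three copies of every partition).
        System.out.println(lostPartitions(12, 6, 2, Set.of(0, 1)));    // two nodes failed
        System.out.println(lostPartitions(12, 6, 2, Set.of(0, 1, 2))); // three nodes failed
    }
}
```

In this toy layout the first call prints [] and the second prints [0, 6]: the job in your second case would silently sum over incomplete data only once the number of failed nodes exceeds the number of backups.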