Hello Alan,

> 1. Does data rebalancing occur when a node leaves or joins, or only when
> you manually change the baseline topology (assuming automatic baseline
> adjustment is disabled)? Again, this is on a cluster with persistence
> enabled.

Yes, this can happen when a node joins the cluster, for instance. Consider
the following scenario: you shut down a node that is part of the current
baseline topology, so this node cannot apply updates. After a while, the
node is restarted and returns to the cluster. In this case, rebalancing can
be triggered in order to transfer the updates that the node missed.
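To make the scenario above concrete, here is a minimal toy model in plain Java. It is not Ignite code and does not reflect Ignite internals: the `CatchUpModel` class, the `Copy` type and the `rebalance` method are hypothetical names made up for illustration. It only shows the idea that a node which was offline comes back with an older copy of a partition and receives the updates it missed from a node that stayed online.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: not Ignite internals. All names here are hypothetical. */
public class CatchUpModel {

    /** One copy of a partition, represented as the ordered list of applied updates. */
    static final class Copy {
        final List<String> updates = new ArrayList<>();

        void apply(String update) {
            updates.add(update);
        }

        /** Number of updates applied so far, used as a simple update counter. */
        int counter() {
            return updates.size();
        }
    }

    /**
     * Sends the demander every update it has not applied yet.
     * Returns the number of updates that were transferred.
     */
    static int rebalance(Copy supplier, Copy demander) {
        int from = Math.min(demander.counter(), supplier.counter());
        List<String> missed = new ArrayList<>(supplier.updates.subList(from, supplier.counter()));
        demander.updates.addAll(missed);
        return missed.size();
    }

    public static void main(String[] args) {
        Copy supplier = new Copy();   // copy on a node that stays online
        Copy demander = new Copy();   // copy on the baseline node that is shut down

        // Both nodes are online and apply the same update.
        supplier.apply("put k1");
        demander.apply("put k1");

        // The demander's node is shut down here, so only the supplier applies these.
        supplier.apply("put k2");
        supplier.apply("remove k1");

        // The node returns to the cluster and the missed updates are transferred.
        int transferred = rebalance(supplier, demander);
        System.out.println("transferred " + transferred + " updates");
        System.out.println("copies equal: " + supplier.updates.equals(demander.updates));
    }
}
```

In this sketch a second call to `rebalance` transfers nothing, which mirrors the point that rebalancing is only needed while the returning node is behind.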

> 2. Sometimes I look at the partition counts of a cache across all the
> nodes using
> Arrays.stream(ignite.affinity(cacheName).primaryPartitions(severNode) and I
> see 0 partitions on one or even two nodes for some of the caches. After a
> while it returns to a balanced state. What's going on here?

Well, when a partition needs to be rebalanced from one node (the supplier)
to another (the demander), we create a partition on the demander in the
MOVING state (this means a backup that applies updates but cannot be used
for reads). Once this partition is fully rebalanced, it is switched to the
OWNING state, and the next PME (Late Affinity Assignment) may mark this
partition as primary. Until then, the demander may hold no primary
partitions for the cache, which would explain the 0 you see.

> 3. Is there a way to manually invoke the partition map exchange process?

I don't think so.

> 4. Sometimes I see 'partition lost' errors. If I am using persistence and
> all the baseline nodes are online and connected, is it safe to assume no
> data has been lost and just call cache.resetLostPartitions(myCaches)?

If I am not mistaken, the answer is yes. Please take a look at
https://ignite.apache.org/docs/latest/configuring-caches/partition-loss-policy#recovering-from-a-partition-loss

Thanks,
S.

On Fri, Jan 29, 2021 at 15:56, Alan Ward <arw...@gmail.com> wrote:

> I'm using Ignite 2.9.1, a 5 node cluster with persistence enabled,
> partitioned caches with 1 backup.
>
> I'm a bit confused about the difference between data rebalancing and
> partition map exchange in this context.
>
> 1. Does data rebalancing occur when a node leaves or joins, or only when
> you manually change the baseline topology (assuming automatic baseline
> adjustment is disabled)? Again, this is on a cluster with persistence
> enabled.
>
> 2. Sometimes I look at the partition counts of a cache across all the
> nodes using
> Arrays.stream(ignite.affinity(cacheName).primaryPartitions(severNode) and I
> see 0 partitions on one or even two nodes for some of the caches. After a
> while it returns to a balanced state. What's going on here?
> Is this data rebalancing at work, or is this the result of the partition
> map exchange process determining that one node is/was down and thus
> switching to use the backup partitions?
>
> 3. Is there a way to manually invoke the partition map exchange process? I
> figured it would happen on cluster restart, but even after restarting the
> cluster and seeing all baseline nodes connect I still observe the partition
> imbalance. It often takes hours for this to resolve.
>
> 4. Sometimes I see 'partition lost' errors. If I am using persistence and
> all the baseline nodes are online and connected, is it safe to assume no
> data has been lost and just call cache.resetLostPartitions(myCaches)? Is
> there a way calling that method would lead to data loss with persistence
> enabled?
>
> thanks for your help!
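
The MOVING/OWNING life cycle described in the reply to question 2 can be sketched with a small toy model in plain Java. This is not Ignite code: `LateAffinityModel`, `Partition`, `effectivePrimary`, `primaryCounts`, `finishRebalance` and `sample` are hypothetical names, and the rule used here (a node is treated as primary only once its copy is OWNING) is a simplifying assumption based on the explanation above. The sketch shows why a node that has just rejoined can report 0 primary partitions for a while and then return to a balanced count.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: not Ignite code. All names here are hypothetical. */
public class LateAffinityModel {

    enum State { MOVING, OWNING }

    /** One partition: the node that should become primary, and the node that already owns the data. */
    static final class Partition {
        final String idealPrimary;   // node the balanced assignment wants as primary
        final String currentOwner;   // node that already holds a fully rebalanced copy
        State idealPrimaryState;     // state of the copy on idealPrimary

        Partition(String idealPrimary, String currentOwner, State idealPrimaryState) {
            this.idealPrimary = idealPrimary;
            this.currentOwner = currentOwner;
            this.idealPrimaryState = idealPrimaryState;
        }

        /** Assumption of this model: a MOVING copy is never primary, so the current owner keeps serving. */
        String effectivePrimary() {
            return idealPrimaryState == State.OWNING ? idealPrimary : currentOwner;
        }
    }

    /** Counts primary partitions per node, similar in spirit to checking primaryPartitions(node).length. */
    static Map<String, Integer> primaryCounts(List<Partition> parts) {
        Map<String, Integer> counts = new HashMap<>();
        for (Partition p : parts) {
            counts.merge(p.effectivePrimary(), 1, Integer::sum);
        }
        return counts;
    }

    /** Models the end of rebalancing: every MOVING copy becomes OWNING. */
    static void finishRebalance(List<Partition> parts) {
        for (Partition p : parts) {
            p.idealPrimaryState = State.OWNING;
        }
    }

    /** Four partitions; node C has just rejoined and its two copies are still MOVING. */
    static List<Partition> sample() {
        List<Partition> parts = new ArrayList<>();
        parts.add(new Partition("A", "A", State.OWNING));
        parts.add(new Partition("B", "B", State.OWNING));
        parts.add(new Partition("C", "A", State.MOVING));
        parts.add(new Partition("C", "B", State.MOVING));
        return parts;
    }

    public static void main(String[] args) {
        List<Partition> parts = sample();
        System.out.println("during rebalance: " + primaryCounts(parts));
        finishRebalance(parts);
        System.out.println("after rebalance:  " + primaryCounts(parts));
    }
}
```

While the copies on node C are MOVING, C does not appear in the counts at all. Once they switch to OWNING, C takes over its share, which matches the "0 partitions, then balanced again" behaviour described in the question.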