Pavel Kovalenko created IGNITE-10799:
----------------------------------------

             Summary: Optimize affinity initialization/re-calculation
                 Key: IGNITE-10799
                 URL: https://issues.apache.org/jira/browse/IGNITE-10799
             Project: Ignite
          Issue Type: Improvement
          Components: cache
    Affects Versions: 2.1
            Reporter: Pavel Kovalenko
            Assignee: Pavel Kovalenko
             Fix For: 2.8


When persistence is enabled and a baseline is set, affinity is recalculated 
through two main entry points:

{noformat}
org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager#onServerJoinWithExchangeMergeProtocol
org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager#onServerLeftWithExchangeMergeProtocol
{noformat}

Both follow the same recalculation approach:
1) Take the current baseline (ideal assignment).
2) Filter out offline nodes from it.
3) Choose new primary nodes if the previous ones went away.
4) Place the temporary primary nodes into the late affinity assignment set.
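
The four steps above can be sketched roughly as follows. This is a simplified 
illustration, not the code in CacheAffinitySharedManager: the class name, the 
Result holder, and the use of plain String node IDs are all assumptions made 
for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the recalculation steps; not Ignite's actual code. */
final class AffinityRecalcSketch {
    /** Holds the filtered assignment and the partitions that got a temporary primary. */
    static final class Result {
        final List<List<String>> assignment;
        final Map<Integer, String> temporaryPrimaries;

        Result(List<List<String>> assignment, Map<Integer, String> temporaryPrimaries) {
            this.assignment = assignment;
            this.temporaryPrimaries = temporaryPrimaries;
        }
    }

    /**
     * @param ideal Step 1: ideal (baseline) assignment, one owner list per partition,
     *              with the primary node first.
     * @param online IDs of the nodes that are currently online.
     */
    static Result recalculate(List<List<String>> ideal, Set<String> online) {
        List<List<String>> assignment = new ArrayList<>(ideal.size());
        Map<Integer, String> temporaryPrimaries = new HashMap<>();

        for (int p = 0; p < ideal.size(); p++) {
            List<String> idealOwners = ideal.get(p);

            // Step 2: filter out offline nodes (one lookup per owner, one copy per partition).
            List<String> alive = new ArrayList<>(idealOwners.size());

            for (String node : idealOwners) {
                if (online.contains(node))
                    alive.add(node);
            }

            // Step 3: the first surviving owner acts as primary.
            // Step 4: if it is not the ideal primary, remember it as a temporary primary.
            if (!alive.isEmpty() && !alive.get(0).equals(idealOwners.get(0)))
                temporaryPrimaries.put(p, alive.get(0));

            assignment.add(alive);
        }

        return new Result(assignment, temporaryPrimaries);
    }
}
```

For an ideal assignment [[A, B, C], [B, C, A]] with only B and C online, both 
partitions end up with owners [B, C]; partition 0 records B as a temporary 
primary (its ideal primary A is offline), while partition 1 keeps its ideal 
primary B.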

Looking at the implementation details, we can see that we do a lot of 
unnecessary online-node cache lookups and array list copies. Performance 
degrades most when we recalculate affinity for replicated caches (it takes 
P * N operations on each node, where P is the partition count and N is the 
number of nodes in the cluster). With a large partition count or a large 
cluster, it may take a few seconds, which is unacceptable because this process 
happens during PME (partition map exchange) and freezes ongoing cluster 
operations.
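
One possible direction, shown only as a hypothetical sketch and not as the fix 
chosen for this issue, is to avoid the per-partition list copy whenever every 
owner is still online. The helper below (its name and the String node IDs are 
assumptions) still performs one lookup per owner, so the P * N lookups remain, 
but it allocates a new list only for partitions that actually lost a node.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical helper; not part of Ignite. */
final class AffinityFilterSketch {
    /**
     * Returns the owners of one partition with offline nodes removed.
     * The ideal list itself is returned when nothing has to be filtered out,
     * so the common case (all baseline nodes online) allocates nothing.
     */
    static List<String> filterOnline(List<String> idealOwners, Set<String> online) {
        List<String> filtered = null; // Allocated lazily, on the first offline node.

        for (int i = 0; i < idealOwners.size(); i++) {
            String node = idealOwners.get(i);

            if (online.contains(node)) {
                if (filtered != null)
                    filtered.add(node);
            }
            else if (filtered == null) {
                // First offline node: copy the prefix seen so far, which is all online.
                filtered = new ArrayList<>(idealOwners.size() - 1);
                filtered.addAll(idealOwners.subList(0, i));
            }
        }

        return filtered == null ? idealOwners : filtered;
    }
}
```

The caller must treat the returned list as read-only, because it may be the 
same instance as the ideal assignment.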

We should investigate possible bottlenecks and improve the performance of 
affinity recalculation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
