Hello Marcus,

It looks like a bug. I will check and file a Jira issue.
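In the meantime, one possible mitigation (explicitly not a fix for the root cause): the exception says the requested AffinityTopologyVersion has already been evicted from the affinity assignment history, and the size of that history is controlled by the IGNITE_AFFINITY_HISTORY_SIZE system property. A minimal sketch of setting it before node startup, assuming the node is started from code; the value 100 and the config path are illustrative assumptions, and the property can equally be passed as -DIGNITE_AFFINITY_HISTORY_SIZE=100 on the JVM command line:

    import org.apache.ignite.IgniteSystemProperties;
    import org.apache.ignite.Ignition;

    public class StartServerNode {
        public static void main(String[] args) {
            // Must be set before the node starts. Keeps more topology versions in
            // the affinity assignment history; 100 is an arbitrary example value.
            System.setProperty(IgniteSystemProperties.IGNITE_AFFINITY_HISTORY_SIZE, "100");

            Ignition.start("config/ignite-server.xml"); // hypothetical config path
        }
    }

A larger history only widens the window in which late operations can still resolve an older topology version; it does not remove the race itself.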
> I suppose Ignite should be fault tolerant and outage of some nodes should not cause shutdown of other nodes.

You are absolutely right.

Thanks,
S.

Wed, 6 Apr 2022 at 10:49, Lo, Marcus <marcus...@citi.com>:

> Hi Ignite team,
>
> Can you please advise if there is anything that we can check on the below? Thanks.
>
> Regards,
> Marcus
>
> *From:* Lo, Marcus [ICG-IT]
> *Sent:* Wednesday, March 30, 2022 11:55 AM
> *To:* user
> *Subject:* Node crashed with error "Getting affinity for too old topology version that is already out of history"
>
> Hi,
>
> We are using Ignite 2.10.0 and have 5 nodes (with consistentId/hostname lrdeqprmap01p, lrdeqprmap02p, lrdeqprmap03p, lcgeqprmap03c, lcgeqprmap04c) in the cluster. At one point two of the nodes (lcgeqprmap03c, lcgeqprmap04c) went down due to a power outage. Somehow another node, lrdeqprmap03p, shut down shortly after that with the following error:
>
> 2022-03-29 14:32:01.996+0100 ERROR [query-#194160%Prism%] : Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.tree.CorruptedTreeException: B+Tree is corrupted [pages(groupId, pageId)=[], cacheId=388652627, cacheName=LIMIT_DASHBOARD_SNAPSHOT, indexName=_key_PK, msg=Runtime failure on bounds: [lower=null, upper=null]]]]
> org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException: B+Tree is corrupted [pages(groupId, pageId)=[], cacheId=388652627, cacheName=LIMIT_DASHBOARD_SNAPSHOT, indexName=_key_PK, msg=Runtime failure on bounds: [lower=null, upper=null]]
>     at org.apache.ignite.internal.processors.query.h2.database.H2Tree.corruptedTreeException(H2Tree.java:977) ~[ignite-indexing-2.10.0.jar:2.10.0]
>     at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1133) ~[ignite-core-2.10.0.jar:2.10.0]
>     at org.apache.ignite.internal.processors.query.h2.database.H2TreeIndex.find(H2TreeIndex.java:415) ~[ignite-indexing-2.10.0.jar:2.10.0]
>     ...
> Caused by: org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTreeRuntimeException: java.lang.IllegalStateException: Getting affinity for too old topology version that is already out of history [locNode=TcpDiscoveryNode [id=e21d561d-314a-4240-a379-23f139870717, consistentId=lrdeqprmap03p, addrs=ArrayList [127.0.0.1, 169.182.110.133], sockAddrs=HashSet [/127.0.0.1:47500, lrdeqprmap03p.eur.nsroot.net/169.182.110.133:47500], discPort=47500, order=7, intOrder=7, lastExchangeTime=1648560721652, loc=true, ver=2.10.0#20210310-sha1:bc24f6ba, isClient=false], grp=LimitDashboardSnapshotCache, topVer=AffinityTopologyVersion [topVer=17, minorTopVer=28], lastAffChangeTopVer=AffinityTopologyVersion [topVer=17, minorTopVer=28], head=AffinityTopologyVersion [topVer=19, minorTopVer=0], history=[AffinityTopologyVersion [topVer=18, minorTopVer=0], AffinityTopologyVersion [topVer=19, minorTopVer=0]]]
>     at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findLowerUnbounded(BPlusTree.java:1079) ~[ignite-core-2.10.0.jar:2.10.0]
>     at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1118) ~[ignite-core-2.10.0.jar:2.10.0]
>     ... 23 more
> Caused by: java.lang.IllegalStateException: Getting affinity for too old topology version that is already out of history [locNode=TcpDiscoveryNode [id=e21d561d-314a-4240-a379-23f139870717, consistentId=lrdeqprmap03p, addrs=ArrayList [127.0.0.1, 169.182.110.133], sockAddrs=HashSet [/127.0.0.1:47500, lrdeqprmap03p.eur.nsroot.net/169.182.110.133:47500], discPort=47500, order=7, intOrder=7, lastExchangeTime=1648560721652, loc=true, ver=2.10.0#20210310-sha1:bc24f6ba, isClient=false], grp=LimitDashboardSnapshotCache, topVer=AffinityTopologyVersion [topVer=17, minorTopVer=28], lastAffChangeTopVer=AffinityTopologyVersion [topVer=17, minorTopVer=28], head=AffinityTopologyVersion [topVer=19, minorTopVer=0], history=[AffinityTopologyVersion [topVer=18, minorTopVer=0], AffinityTopologyVersion [topVer=19, minorTopVer=0]]]
>     at org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache.cachedAffinity(GridAffinityAssignmentCache.java:831) ~[ignite-core-2.10.0.jar:2.10.0]
>     at org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache.cachedAffinity(GridAffinityAssignmentCache.java:778) ~[ignite-core-2.10.0.jar:2.10.0]
>     at org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache.nodes(GridAffinityAssignmentCache.java:686) ~[ignite-core-2.10.0.jar:2.10.0]
>     ...
>
> I suppose Ignite should be fault tolerant and outage of some nodes should not cause shutdown of other nodes. Can you please advise? I have attached the full log for your reference. Thanks.
>
> Regards,
> Marcus
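A note on what the log shows: StopNodeOrHaltFailureHandler is Ignite's default failure handler, and on a CRITICAL_ERROR such as CorruptedTreeException it deliberately stops the node so that a JVM with a possibly corrupted index does not keep serving requests. For reference, a sketch of how the handler printed in your log (tryStop=false, timeout=0) would be configured explicitly; I would not suggest ignoring CRITICAL_ERROR to keep the node alive:

    import org.apache.ignite.Ignition;
    import org.apache.ignite.configuration.IgniteConfiguration;
    import org.apache.ignite.failure.StopNodeOrHaltFailureHandler;

    public class FailureHandlerSetup {
        public static void main(String[] args) {
            IgniteConfiguration cfg = new IgniteConfiguration();

            // Same handler as in your log: halt the JVM immediately (tryStop=false,
            // timeout=0). SYSTEM_WORKER_BLOCKED and SYSTEM_CRITICAL_OPERATION_TIMEOUT
            // are ignored by default, which also matches the log output.
            cfg.setFailureHandler(new StopNodeOrHaltFailureHandler(false, 0));

            Ignition.start(cfg);
        }
    }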
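Also, independent of this bug: with 5 server nodes, surviving the simultaneous loss of 2 of them without data loss requires at least 2 backups per partitioned cache, so it is worth double-checking the backup count on the affected cache. A sketch under those assumptions; the cache name comes from your log, everything else is illustrative:

    import org.apache.ignite.cache.CacheMode;
    import org.apache.ignite.cache.PartitionLossPolicy;
    import org.apache.ignite.configuration.CacheConfiguration;

    public class CacheBackupSetup {
        public static void main(String[] args) {
            CacheConfiguration<Object, Object> ccfg =
                new CacheConfiguration<>("LIMIT_DASHBOARD_SNAPSHOT");

            ccfg.setCacheMode(CacheMode.PARTITIONED);
            // Two backups keep a full copy of every partition even if two nodes
            // fail at the same time.
            ccfg.setBackups(2);
            // Fail operations on lost partitions instead of silently serving
            // incomplete data.
            ccfg.setPartitionLossPolicy(PartitionLossPolicy.READ_WRITE_SAFE);
        }
    }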