Hi Ignite team,

Could you please advise whether there is anything we can check regarding the issue below? Thanks.
Regards,
Marcus

From: Lo, Marcus [ICG-IT]
Sent: Wednesday, March 30, 2022 11:55 AM
To: user
Subject: Node crashed with error "Getting affinity for too old topology version that is already out of history"

Hi,

We are using Ignite 2.10.0 and have 5 nodes in the cluster (consistentId/hostname: lrdeqprmap01p, lrdeqprmap02p, lrdeqprmap03p, lcgeqprmap03c, lcgeqprmap04c). At one point, two of the nodes (lcgeqprmap03c, lcgeqprmap04c) went down due to a power outage. Shortly afterwards another node, lrdeqprmap03p, also shut down, with the following error:

2022-03-29 14:32:01.996+0100 ERROR [query-#194160%Prism%] : Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.tree.CorruptedTreeException: B+Tree is corrupted [pages(groupId, pageId)=[], cacheId=388652627, cacheName=LIMIT_DASHBOARD_SNAPSHOT, indexName=_key_PK, msg=Runtime failure on bounds: [lower=null, upper=null]]]]
org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException: B+Tree is corrupted [pages(groupId, pageId)=[], cacheId=388652627, cacheName=LIMIT_DASHBOARD_SNAPSHOT, indexName=_key_PK, msg=Runtime failure on bounds: [lower=null, upper=null]]
    at org.apache.ignite.internal.processors.query.h2.database.H2Tree.corruptedTreeException(H2Tree.java:977) ~[ignite-indexing-2.10.0.jar:2.10.0]
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1133) ~[ignite-core-2.10.0.jar:2.10.0]
    at org.apache.ignite.internal.processors.query.h2.database.H2TreeIndex.find(H2TreeIndex.java:415) ~[ignite-indexing-2.10.0.jar:2.10.0]
    ...
Caused by: org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTreeRuntimeException: java.lang.IllegalStateException: Getting affinity for too old topology version that is already out of history [locNode=TcpDiscoveryNode [id=e21d561d-314a-4240-a379-23f139870717, consistentId=lrdeqprmap03p, addrs=ArrayList [127.0.0.1, 169.182.110.133], sockAddrs=HashSet [/127.0.0.1:47500, lrdeqprmap03p.eur.nsroot.net/169.182.110.133:47500], discPort=47500, order=7, intOrder=7, lastExchangeTime=1648560721652, loc=true, ver=2.10.0#20210310-sha1:bc24f6ba, isClient=false], grp=LimitDashboardSnapshotCache, topVer=AffinityTopologyVersion [topVer=17, minorTopVer=28], lastAffChangeTopVer=AffinityTopologyVersion [topVer=17, minorTopVer=28], head=AffinityTopologyVersion [topVer=19, minorTopVer=0], history=[AffinityTopologyVersion [topVer=18, minorTopVer=0], AffinityTopologyVersion [topVer=19, minorTopVer=0]]]
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findLowerUnbounded(BPlusTree.java:1079) ~[ignite-core-2.10.0.jar:2.10.0]
    at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1118) ~[ignite-core-2.10.0.jar:2.10.0]
    ...
23 more
Caused by: java.lang.IllegalStateException: Getting affinity for too old topology version that is already out of history [locNode=TcpDiscoveryNode [id=e21d561d-314a-4240-a379-23f139870717, consistentId=lrdeqprmap03p, addrs=ArrayList [127.0.0.1, 169.182.110.133], sockAddrs=HashSet [/127.0.0.1:47500, lrdeqprmap03p.eur.nsroot.net/169.182.110.133:47500], discPort=47500, order=7, intOrder=7, lastExchangeTime=1648560721652, loc=true, ver=2.10.0#20210310-sha1:bc24f6ba, isClient=false], grp=LimitDashboardSnapshotCache, topVer=AffinityTopologyVersion [topVer=17, minorTopVer=28], lastAffChangeTopVer=AffinityTopologyVersion [topVer=17, minorTopVer=28], head=AffinityTopologyVersion [topVer=19, minorTopVer=0], history=[AffinityTopologyVersion [topVer=18, minorTopVer=0], AffinityTopologyVersion [topVer=19, minorTopVer=0]]]
    at org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache.cachedAffinity(GridAffinityAssignmentCache.java:831) ~[ignite-core-2.10.0.jar:2.10.0]
    at org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache.cachedAffinity(GridAffinityAssignmentCache.java:778) ~[ignite-core-2.10.0.jar:2.10.0]
    at org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache.nodes(GridAffinityAssignmentCache.java:686) ~[ignite-core-2.10.0.jar:2.10.0]
    ...

I would expect Ignite to be fault tolerant, so an outage of some nodes should not cause other nodes to shut down. Can you please advise? I have attached the full log for your reference.

Thanks.

Regards,
Marcus