Hi Ignite team,

Can you please advise if there is anything we can check regarding the issue below?
Thanks.

Regards,
Marcus

From: Lo, Marcus [ICG-IT]
Sent: Wednesday, March 30, 2022 11:55 AM
To: user
Subject: Node crashed with error "Getting affinity for too old topology version 
that is already out of history"

Hi,

We are using Ignite 2.10.0 and have 5 nodes in the cluster (consistentId/hostname: 
lrdeqprmap01p, lrdeqprmap02p, lrdeqprmap03p, lcgeqprmap03c, lcgeqprmap04c). 
At one point, 2 of the nodes (lcgeqprmap03c, lcgeqprmap04c) went down due to a 
power outage. Shortly after that, another node, lrdeqprmap03p, shut down with 
the following error:

2022-03-29 14:32:01.996+0100 ERROR [query-#194160%Prism%]                       
                   : Critical system error detected. Will be handled 
accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler 
[tryStop=false, timeout=0, super=AbstractFailureHandler 
[ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, 
SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext 
[type=CRITICAL_ERROR, err=class 
o.a.i.i.processors.cache.persistence.tree.CorruptedTreeException: B+Tree is 
corrupted [pages(groupId, pageId)=[], cacheId=388652627, 
cacheName=LIMIT_DASHBOARD_SNAPSHOT, indexName=_key_PK, msg=Runtime failure on 
bounds: [lower=null, upper=null]]]]
org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException:
 B+Tree is corrupted [pages(groupId, pageId)=[], cacheId=388652627, 
cacheName=LIMIT_DASHBOARD_SNAPSHOT, indexName=_key_PK, msg=Runtime failure on 
bounds: [lower=null, upper=null]]
                at 
org.apache.ignite.internal.processors.query.h2.database.H2Tree.corruptedTreeException(H2Tree.java:977)
 ~[ignite-indexing-2.10.0.jar:2.10.0]
                at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1133)
 ~[ignite-core-2.10.0.jar:2.10.0]
                at 
org.apache.ignite.internal.processors.query.h2.database.H2TreeIndex.find(H2TreeIndex.java:415)
 ~[ignite-indexing-2.10.0.jar:2.10.0]
...
Caused by: 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTreeRuntimeException:
 java.lang.IllegalStateException: Getting affinity for too old topology version 
that is already out of history [locNode=TcpDiscoveryNode 
[id=e21d561d-314a-4240-a379-23f139870717, consistentId=lrdeqprmap03p, 
addrs=ArrayList [127.0.0.1, 169.182.110.133], sockAddrs=HashSet 
[/127.0.0.1:47500, lrdeqprmap03p.eur.nsroot.net/169.182.110.133:47500], 
discPort=47500, order=7, intOrder=7, lastExchangeTime=1648560721652, loc=true, 
ver=2.10.0#20210310-sha1:bc24f6ba, isClient=false], 
grp=LimitDashboardSnapshotCache, topVer=AffinityTopologyVersion [topVer=17, 
minorTopVer=28], lastAffChangeTopVer=AffinityTopologyVersion [topVer=17, 
minorTopVer=28], head=AffinityTopologyVersion [topVer=19, minorTopVer=0], 
history=[AffinityTopologyVersion [topVer=18, minorTopVer=0], 
AffinityTopologyVersion [topVer=19, minorTopVer=0]]]
                at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findLowerUnbounded(BPlusTree.java:1079)
 ~[ignite-core-2.10.0.jar:2.10.0]
                at 
org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.find(BPlusTree.java:1118)
 ~[ignite-core-2.10.0.jar:2.10.0]
                ... 23 more
Caused by: java.lang.IllegalStateException: Getting affinity for too old 
topology version that is already out of history [locNode=TcpDiscoveryNode 
[id=e21d561d-314a-4240-a379-23f139870717, consistentId=lrdeqprmap03p, 
addrs=ArrayList [127.0.0.1, 169.182.110.133], sockAddrs=HashSet 
[/127.0.0.1:47500, lrdeqprmap03p.eur.nsroot.net/169.182.110.133:47500], 
discPort=47500, order=7, intOrder=7, lastExchangeTime=1648560721652, loc=true, 
ver=2.10.0#20210310-sha1:bc24f6ba, isClient=false], 
grp=LimitDashboardSnapshotCache, topVer=AffinityTopologyVersion [topVer=17, 
minorTopVer=28], lastAffChangeTopVer=AffinityTopologyVersion [topVer=17, 
minorTopVer=28], head=AffinityTopologyVersion [topVer=19, minorTopVer=0], 
history=[AffinityTopologyVersion [topVer=18, minorTopVer=0], 
AffinityTopologyVersion [topVer=19, minorTopVer=0]]]
                at 
org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache.cachedAffinity(GridAffinityAssignmentCache.java:831)
 ~[ignite-core-2.10.0.jar:2.10.0]
                at 
org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache.cachedAffinity(GridAffinityAssignmentCache.java:778)
 ~[ignite-core-2.10.0.jar:2.10.0]
                at 
org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache.nodes(GridAffinityAssignmentCache.java:686)
 ~[ignite-core-2.10.0.jar:2.10.0]
                ...

I would expect Ignite to be fault tolerant, so an outage of some nodes should 
not cause other nodes to shut down. Can you please advise? I have attached the 
full log for your reference. Thanks.

Regards,
Marcus
