[
https://issues.apache.org/jira/browse/IGNITE-9793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Pavel Voronkin updated IGNITE-9793:
-----------------------------------
Description:
There is coordinator and ring of nodes
coordinator -> 1 -> 2 - > 3 -> 4
coordinator deactivated:
2018-09-24 15:09:01.609 [INFO
][exchange-worker-#153%DPL_GRID%DplGridNodeName%|#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture]
Successfully deactivated data structures, services and caches
[nodeId=e002e011-8d1c-4353-a0f3-b71264c5b0f4, client=false,
topVer=AffinityTopologyVersion [topVer=183, minorTopVer=1]]
2018-09-24 15:09:01.620
[DEBUG][exchange-worker-#153%DPL_GRID%DplGridNodeName%|#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.l.ExchangeLatchManager]
Server latch is created [latch=CompletableLatchUid{id='exchange',
topVer=AffinityTopologyVersion [topVer=183, minorTopVer=1]},
participantsSize=160]
2018-09-24 15:09:01.621 [INFO
][exchange-worker-#153%DPL_GRID%DplGridNodeName%|#153%DPL_GRID%DplGridNodeName%]
nodes 1, 2, 3, 4 were deactivated:
2018-09-24 15:09:01.609 [INFO
][exchange-worker-#153%DPL_GRID%DplGridNodeName%|#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture]
Successfully deactivated data structures, services and caches
[nodeId=e002e011-8d1c-4353-a0f3-b71264c5b0f4, client=false,
topVer=AffinityTopologyVersion [topVer=183, minorTopVer=1]]
2018-09-24 15:09:03.328 [INFO
][exchange-worker-#153%DPL_GRID%DplGridNodeName%|#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture]
Successfully deactivated data structures, services and caches
[nodeId=22a58223-47b5-43c2-897b-e70e8e50edf7, client=false,
topVer=AffinityTopologyVersion [topVer=183, minorTopVer=1]]
2018-09-24 15:09:03.334 [INFO
][exchange-worker-#153%DPL_GRID%DplGridNodeName%|#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture]
Successfully deactivated data structures, services and caches
[nodeId=973eb8ce-3b8c-463d-a6ab-00ac66d93f13, client=false,
topVer=AffinityTopologyVersion [topVer=183, minorTopVer=1]]
2018-09-24 15:09:03.332 [INFO
][exchange-worker-#153%DPL_GRID%DplGridNodeName%|#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture]
Successfully deactivated data structures, services and caches
[nodeId=a904bac4-aaed-4f69-90f3-c13bc4d331d1, client=false,
topVer=AffinityTopologyVersion [topVer=183, minorTopVer=1]]
Node 2 SEGMENTED
2018-09-24 15:17:50.068 [WARN
][tcp-disco-msg-worker-#2%DPL_GRID%DplGridNodeName%|#2%DPL_GRID%DplGridNodeName%][o.a.i.s.d.tcp.TcpDiscoverySpi]
Node is out of topology (probably, due to short-time network problems).
2018-09-24 15:17:50.069 [WARN
][disco-event-worker-#152%DPL_GRID%DplGridNodeName%|#152%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager]
Local node SEGMENTED: TcpDiscoveryNode
[id=a904bac4-aaed-4f69-90f3-c13bc4d331d1, addrs=ArrayList [10.116.206.98],
sockAddrs=HashSet [grid724.domain/10.116.206.98:47500], discPort=47500,
order=110, intOrder=110, lastExchangeTime=1537791470063, loc=true,
ver=2.5.1#20180906-sha1:ebde6c79, isClient=false]
Coordinator started activation on topology without node2
2018-09-24 15:19:48.686 [INFO
][exchange-worker-#153%DPL_GRID%DplGridNodeName%|#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture]
Start activation process [nodeId=e002e011-8d1c-4353-a0f3-b71264c5b0f4,
client=false, topVer=AffinityTopologyVersion [topVer=188, minorTopVer=1]]
But node 3 which is next to node 2 haven't received activation message.
Coordinator sent activation to all except 3.
Node 3 haven't received activation message.
2018-09-24 15:24:25.911 [INFO
][sys-#28144%DPL_GRID%DplGridNodeName%|#28144%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture]
Coordinator received single message [ver=AffinityTopologyVersion [topVer=188,
minorTopVer=1], node=073f1598-6b70-49df-8f45-126735611775, allReceived=false]
GridDhtPartitionsExchangeFuture hangs forever.
was:
There is coordinator and ring of nodes
coordinator -> 1 -> 2 - > 3 -> 4
coordinator deactivated:
2018-09-24 15:09:01.609 [INFO
][exchange-worker-#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture]
Successfully deactivated data structures, services and caches
[nodeId=e002e011-8d1c-4353-a0f3-b71264c5b0f4, client=false,
topVer=AffinityTopologyVersion [topVer=183, minorTopVer=1]]
2018-09-24 15:09:01.620
[DEBUG][exchange-worker-#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.l.ExchangeLatchManager]
Server latch is created [latch=CompletableLatchUid\{id='exchange',
topVer=AffinityTopologyVersion [topVer=183, minorTopVer=1]},
participantsSize=160]
2018-09-24 15:09:01.621 [INFO ][exchange-worker-#153%DPL_GRID%DplGridNodeName%]
nodes 1, 2, 3, 4 were deactivated:
2018-09-24 15:09:01.609 [INFO
][exchange-worker-#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture]
Successfully deactivated data structures, services and caches
[nodeId=e002e011-8d1c-4353-a0f3-b71264c5b0f4, client=false,
topVer=AffinityTopologyVersion [topVer=183, minorTopVer=1]]
2018-09-24 15:09:03.328 [INFO
][exchange-worker-#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture]
Successfully deactivated data structures, services and caches
[nodeId=22a58223-47b5-43c2-897b-e70e8e50edf7, client=false,
topVer=AffinityTopologyVersion [topVer=183, minorTopVer=1]]
2018-09-24 15:09:03.334 [INFO
][exchange-worker-#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture]
Successfully deactivated data structures, services and caches
[nodeId=973eb8ce-3b8c-463d-a6ab-00ac66d93f13, client=false,
topVer=AffinityTopologyVersion [topVer=183, minorTopVer=1]]
2018-09-24 15:09:03.332 [INFO
][exchange-worker-#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture]
Successfully deactivated data structures, services and caches
[nodeId=a904bac4-aaed-4f69-90f3-c13bc4d331d1, client=false,
topVer=AffinityTopologyVersion [topVer=183, minorTopVer=1]]
Node 2 SEGMENTED
2018-09-24 15:17:50.068 [WARN
][tcp-disco-msg-worker-#2%DPL_GRID%DplGridNodeName%][o.a.i.s.d.tcp.TcpDiscoverySpi]
Node is out of topology (probably, due to short-time network problems).
2018-09-24 15:17:50.069 [WARN
][disco-event-worker-#152%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager]
Local node SEGMENTED: TcpDiscoveryNode
[id=a904bac4-aaed-4f69-90f3-c13bc4d331d1, addrs=ArrayList [10.116.206.98],
sockAddrs=HashSet [grid724.domain/10.116.206.98:47500], discPort=47500,
order=110, intOrder=110, lastExchangeTime=1537791470063, loc=true,
ver=2.5.1#20180906-sha1:ebde6c79, isClient=false]
Coordinator started activation on topology without node2
2018-09-24 15:19:48.686 [INFO
][exchange-worker-#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture]
Start activation process [nodeId=e002e011-8d1c-4353-a0f3-b71264c5b0f4,
client=false, topVer=AffinityTopologyVersion [topVer=188, minorTopVer=1]]
But node 3 which is next to node 2 haven't received activation message.
Coordinator sent activation to all except
2018-09-24 15:24:25.911 [INFO
][sys-#28144%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture]
Coordinator received single message [ver=AffinityTopologyVersion [topVer=188,
minorTopVer=1], node=073f1598-6b70-49df-8f45-126735611775, allReceived=false]
GridDhtPartitionsExchangeFuture hangs forever.
> Deactivation, segmentation of one node, activation may lead to hang
> activation forever
> --------------------------------------------------------------------------------------
>
> Key: IGNITE-9793
> URL: https://issues.apache.org/jira/browse/IGNITE-9793
> Project: Ignite
> Issue Type: Bug
> Affects Versions: 2.5
> Reporter: Pavel Voronkin
> Priority: Major
>
> There is coordinator and ring of nodes
> coordinator -> 1 -> 2 - > 3 -> 4
> coordinator deactivated:
> 2018-09-24 15:09:01.609 [INFO
> ][exchange-worker-#153%DPL_GRID%DplGridNodeName%|#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture]
> Successfully deactivated data structures, services and caches
> [nodeId=e002e011-8d1c-4353-a0f3-b71264c5b0f4, client=false,
> topVer=AffinityTopologyVersion [topVer=183, minorTopVer=1]]
> 2018-09-24 15:09:01.620
> [DEBUG][exchange-worker-#153%DPL_GRID%DplGridNodeName%|#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.l.ExchangeLatchManager]
> Server latch is created [latch=CompletableLatchUid{id='exchange',
> topVer=AffinityTopologyVersion [topVer=183, minorTopVer=1]},
> participantsSize=160]
> 2018-09-24 15:09:01.621 [INFO
> ][exchange-worker-#153%DPL_GRID%DplGridNodeName%|#153%DPL_GRID%DplGridNodeName%]
> nodes 1, 2, 3, 4 were deactivated:
> 2018-09-24 15:09:01.609 [INFO
> ][exchange-worker-#153%DPL_GRID%DplGridNodeName%|#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture]
> Successfully deactivated data structures, services and caches
> [nodeId=e002e011-8d1c-4353-a0f3-b71264c5b0f4, client=false,
> topVer=AffinityTopologyVersion [topVer=183, minorTopVer=1]]
> 2018-09-24 15:09:03.328 [INFO
> ][exchange-worker-#153%DPL_GRID%DplGridNodeName%|#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture]
> Successfully deactivated data structures, services and caches
> [nodeId=22a58223-47b5-43c2-897b-e70e8e50edf7, client=false,
> topVer=AffinityTopologyVersion [topVer=183, minorTopVer=1]]
> 2018-09-24 15:09:03.334 [INFO
> ][exchange-worker-#153%DPL_GRID%DplGridNodeName%|#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture]
> Successfully deactivated data structures, services and caches
> [nodeId=973eb8ce-3b8c-463d-a6ab-00ac66d93f13, client=false,
> topVer=AffinityTopologyVersion [topVer=183, minorTopVer=1]]
> 2018-09-24 15:09:03.332 [INFO
> ][exchange-worker-#153%DPL_GRID%DplGridNodeName%|#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture]
> Successfully deactivated data structures, services and caches
> [nodeId=a904bac4-aaed-4f69-90f3-c13bc4d331d1, client=false,
> topVer=AffinityTopologyVersion [topVer=183, minorTopVer=1]]
> Node 2 SEGMENTED
> 2018-09-24 15:17:50.068 [WARN
> ][tcp-disco-msg-worker-#2%DPL_GRID%DplGridNodeName%|#2%DPL_GRID%DplGridNodeName%][o.a.i.s.d.tcp.TcpDiscoverySpi]
> Node is out of topology (probably, due to short-time network problems).
> 2018-09-24 15:17:50.069 [WARN
> ][disco-event-worker-#152%DPL_GRID%DplGridNodeName%|#152%DPL_GRID%DplGridNodeName%][o.a.i.i.m.d.GridDiscoveryManager]
> Local node SEGMENTED: TcpDiscoveryNode
> [id=a904bac4-aaed-4f69-90f3-c13bc4d331d1, addrs=ArrayList [10.116.206.98],
> sockAddrs=HashSet [grid724.domain/10.116.206.98:47500], discPort=47500,
> order=110, intOrder=110, lastExchangeTime=1537791470063, loc=true,
> ver=2.5.1#20180906-sha1:ebde6c79, isClient=false]
> Coordinator started activation on topology without node2
> 2018-09-24 15:19:48.686 [INFO
> ][exchange-worker-#153%DPL_GRID%DplGridNodeName%|#153%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture]
> Start activation process [nodeId=e002e011-8d1c-4353-a0f3-b71264c5b0f4,
> client=false, topVer=AffinityTopologyVersion [topVer=188, minorTopVer=1]]
> But node 3 which is next to node 2 haven't received activation message.
> Coordinator sent activation to all except 3.
> Node 3 haven't received activation message.
> 2018-09-24 15:24:25.911 [INFO
> ][sys-#28144%DPL_GRID%DplGridNodeName%|#28144%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture]
> Coordinator received single message [ver=AffinityTopologyVersion
> [topVer=188, minorTopVer=1], node=073f1598-6b70-49df-8f45-126735611775,
> allReceived=false]
> GridDhtPartitionsExchangeFuture hangs forever.
>
>
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)