Anton Kalashnikov created IGNITE-12709:
------------------------------------------

             Summary: Server latch initialized after client latch in Zookeeper discovery
                 Key: IGNITE-12709
                 URL: https://issues.apache.org/jira/browse/IGNITE-12709
             Project: Ignite
          Issue Type: Bug
            Reporter: Anton Kalashnikov
            Assignee: Anton Kalashnikov


The coordinator node misses the latch message from the client because it had not received the message that triggers the exchange when the latch message arrived. As a result, the client waits indefinitely for an answer from the coordinator.

{noformat}
[2019-10-23 12:49:42,110][ERROR][sys-#39470%continuous.GridEventConsumeSelfTest0%][GridIoManager] An error occurred processing the message [msg=GridIoMessage [plc=2, topic=TOPIC_EXCHANGE, topicOrd=31, ordered=false, timeout=0, skipOnTimeout=false, msg=org.apache.ignite.internal.processors.cache.distributed.dht.preloader.latch.LatchAckMessage@7699f4f2], nodeId=857a40a8-f384-4740-816c-dd54d3a00001].
class org.apache.ignite.IgniteException: Topology AffinityTopologyVersion [topVer=54, minorTopVer=0] not found in discovery history ; consider increasing IGNITE_DISCOVERY_HISTORY_SIZE property. Current value is -1
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.latch.ExchangeLatchManager.aliveNodesForTopologyVer(ExchangeLatchManager.java:292)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.latch.ExchangeLatchManager.getLatchCoordinator(ExchangeLatchManager.java:334)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.latch.ExchangeLatchManager.processAck(ExchangeLatchManager.java:379)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.latch.ExchangeLatchManager.lambda$new$0(ExchangeLatchManager.java:119)
    at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1632)
    at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1252)
    at org.apache.ignite.internal.managers.communication.GridIoManager.access$4300(GridIoManager.java:143)
    at org.apache.ignite.internal.managers.communication.GridIoManager$8.execute(GridIoManager.java:1143)
    at org.apache.ignite.internal.managers.communication.TraceRunnable.run(TraceRunnable.java:50)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
[2019-10-23 12:50:02,106][WARN ][exchange-worker-#39517%continuous.GridEventConsumeSelfTest1%][GridDhtPartitionsExchangeFuture] Unable to await partitions release latch within timeout: ClientLatch [coordinator=ZookeeperClusterNode [id=760ca6b5-f30b-4c40-81b1-5b602c200000, addrs=[127.0.0.1], order=1, loc=false, client=false], ackSent=true, super=CompletableLatch [id=CompletableLatchUid [id=exchange, topVer=AffinityTopologyVersion [topVer=54, minorTopVer=0]]]]
[2019-10-23 12:50:02,192][WARN ][exchange-worker-#39469%continuous.GridEventConsumeSelfTest0%][GridDhtPartitionsExchangeFuture] Unable to await partitions release latch within timeout: ServerLatch [permits=1, pendingAcks=HashSet [06c3094b-c1f3-4fe8-81e8-22cb66000002], super=CompletableLatch [id=CompletableLatchUid [id=exchange, topVer=AffinityTopologyVersion [topVer=54, minorTopVer=0]]]]
{noformat}

Reproduced by org.apache.ignite.internal.processors.continuous.GridEventConsumeSelfTest#testMultithreadedWithNodeRestart
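
The ordering problem described above (an acknowledgement reaching the coordinator before the coordinator has created the matching server latch) can be illustrated with a minimal, self-contained sketch. This is not Ignite's ExchangeLatchManager code and not necessarily the fix for this issue; the class LatchAckBuffer, its methods, and the string ids are hypothetical names chosen for illustration. The idea shown is to park early acks rather than lose them, and to apply them once the latch exists.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: tolerate acks that arrive before the server latch is created. */
public class LatchAckBuffer {
    /** Acks received before the matching server latch existed, keyed by latch id. */
    private final Map<String, Set<String>> earlyAcks = new HashMap<>();

    /** For each created server latch, the nodes whose acks are still awaited. */
    private final Map<String, Set<String>> pendingAcks = new HashMap<>();

    /** Called when an ack from nodeId arrives for the given latch. */
    public synchronized void onAck(String latchId, String nodeId) {
        Set<String> pending = pendingAcks.get(latchId);

        if (pending == null) {
            // The exchange has not been triggered on this node yet:
            // remember the ack instead of failing or dropping it.
            earlyAcks.computeIfAbsent(latchId, k -> new HashSet<>()).add(nodeId);

            return;
        }

        pending.remove(nodeId);
    }

    /** Called when the coordinator processes the exchange trigger and creates the latch. */
    public synchronized void createServerLatch(String latchId, Collection<String> participants) {
        Set<String> pending = new HashSet<>(participants);

        // Apply acks that arrived before the latch was created.
        Set<String> early = earlyAcks.remove(latchId);

        if (early != null)
            pending.removeAll(early);

        pendingAcks.put(latchId, pending);
    }

    /** Returns true once the latch exists and every participant has acknowledged. */
    public synchronized boolean isComplete(String latchId) {
        Set<String> pending = pendingAcks.get(latchId);

        return pending != null && pending.isEmpty();
    }
}
```

In this sketch, calling onAck("exchange-54", "client") before createServerLatch("exchange-54", ...) leaves the latch incomplete until it is created, after which isComplete("exchange-54") returns true without the client having to resend its ack.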