Ksenia Rybakova created IGNITE-5707:
---------------------------------------

             Summary: Client can't resume streaming even after topology got stable during load test
                 Key: IGNITE-5707
                 URL: https://issues.apache.org/jira/browse/IGNITE-5707
             Project: Ignite
          Issue Type: Bug
    Affects Versions: 2.1
            Reporter: Ksenia Rybakova


Load test config:
- CacheRandomOperationBenchmark
- 8 clients, 48 servers on 8 hosts
- 26 physical caches of different types with different memory policies + 30 groups with 10 partitioned caches each + 20 groups with 10 replicated caches each. Total: 526 caches.
- Preloading amount: 50K, key range: 60K
Complete configs are attached.

3 of 8 clients print the following messages during preloading:
{noformat}
[12:17:56] (err) Failed to execute compound future reducer: GridCompoundFuture [rdc=null, initFlag=1, lsnrCalls=0, done=false, cancelled=false, err=null, futs=[true, false, false]]
[12:17:56] (err) Failed to execute compound future reducer: GridCompoundFuture [rdc=null, initFlag=1, lsnrCalls=0, done=false, cancelled=false, err=null, futs=[true, true, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false]]
[12:17:56] (err) Failed to execute compound future reducer: GridCompoundFuture [rdc=null, initFlag=1, lsnrCalls=0, done=false, cancelled=false, err=null, futs=[true, true, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false, false]]
class org.apache.ignite.IgniteCheckedException: DataStreamer request failed [node=16a20d0c-4009-4bfa-ad6e-0261d9e3b2a3]
        at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl$Buffer.onResponse(DataStreamerImpl.java:1785)
        at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl$3.onMessage(DataStreamerImpl.java:333)
        at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556)
        at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184)
        at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
        at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1097)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: class org.apache.ignite.IgniteCheckedException: DataStreamer will retry data transfer at stable topology [reqTop=AffinityTopologyVersion [topVer=56, minorTopVer=0], topVer=AffinityTopologyVersion [topVer=56, minorTopVer=1], node=remote]
        at org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.localUpdate(DataStreamProcessor.java:343)
        at org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.processRequest(DataStreamProcessor.java:301)
        at org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.access$000(DataStreamProcessor.java:58)
        at org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor$1.onMessage(DataStreamProcessor.java:88)
        ... 7 more
{noformat}

2 drivers were able to resume streaming after some time, but 1 was not (it kept printing the error messages). That driver had high heap utilization, which resulted in a long GC pause. Eventually it was considered failed by the other nodes.
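
For readers unfamiliar with the "will retry data transfer at stable topology" message in the log, here is a minimal Java sketch of the kind of check it suggests: the request carries the topology version it was mapped on (reqTop), and the receiving node compares it with its own current version (topVer). The classes `TopologyVersionSketch` and `StaleTopologyCheck` and the method `mustRetry` are hypothetical names invented for this illustration. They are not Ignite classes, and treating the condition as a plain version mismatch is an assumption; the actual condition in `DataStreamProcessor.localUpdate` may differ.

```java
import java.util.Objects;

/** Hypothetical stand-in for a (major, minor) topology version; NOT the Ignite class. */
final class TopologyVersionSketch {
    final long topVer;
    final int minorTopVer;

    TopologyVersionSketch(long topVer, int minorTopVer) {
        this.topVer = topVer;
        this.minorTopVer = minorTopVer;
    }

    @Override public boolean equals(Object o) {
        if (this == o)
            return true;
        if (!(o instanceof TopologyVersionSketch))
            return false;
        TopologyVersionSketch other = (TopologyVersionSketch)o;
        return topVer == other.topVer && minorTopVer == other.minorTopVer;
    }

    @Override public int hashCode() {
        return Objects.hash(topVer, minorTopVer);
    }
}

public class StaleTopologyCheck {
    /**
     * Assumption: a batch is rejected (and must be re-sent by the client)
     * whenever the version it was mapped on differs from the node's current one.
     */
    static boolean mustRetry(TopologyVersionSketch reqTop, TopologyVersionSketch nodeTop) {
        return !reqTop.equals(nodeTop);
    }

    public static void main(String[] args) {
        // Versions taken from the log: reqTop=(56, 0), topVer=(56, 1).
        TopologyVersionSketch reqTop = new TopologyVersionSketch(56, 0);
        TopologyVersionSketch nodeTop = new TopologyVersionSketch(56, 1);

        System.out.println("retry needed: " + mustRetry(reqTop, nodeTop));
    }
}
```

Under this model, only the minor version differs (0 vs. 1) in the logged request, which is already enough to trigger a retry. A client that stays paused longer than the interval between topology changes could keep sending batches mapped on an outdated version, which would be consistent with the third driver never resuming.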