[ 
https://issues.apache.org/jira/browse/IGNITE-5707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ksenia Rybakova updated IGNITE-5707:
------------------------------------
    Attachment: run-load.properties
                run-load.xml
                ignite-base-load-config.xml

> Client can't resume streaming even after topology got stable during load test
> -----------------------------------------------------------------------------
>
>                 Key: IGNITE-5707
>                 URL: https://issues.apache.org/jira/browse/IGNITE-5707
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.1
>            Reporter: Ksenia Rybakova
>         Attachments: ignite-base-load-config.xml, run-load.properties, 
> run-load.xml
>
>
> Load test config:
> - CacheRandomOperationBenchmark
> - 8 clients, 48 servers on 8 hosts
> - 26 physical caches of different types with different memory policies + 30 
> groups with 10 partitioned caches each + 20 groups with 10 replicated caches 
> each. Total 526 caches.
> - Preloading amount: 50K, key range: 60K
> Complete configs are attached.
> 3 of 8 clients printed the following messages during preloading:
> {noformat}
> [12:17:56] (err) Failed to execute compound future reducer: 
> GridCompoundFuture [rdc=null, initFlag=1, lsnrCalls=0, done=false, 
> cancelled=false, err=null, futs=[true, false, false]][12:17:56] (err) Failed 
> to
> execute compound future reducer: GridCompoundFuture [rdc=null, initFlag=1, 
> lsnrCalls=0, done=false, cancelled=false, err=null, futs=[true, true, false, 
> false, false, false, false, false, false, false, false,
>  false, false, false, false, false, false, false, false, false, false, false, 
> false, false, false, false, false, false, false, false, false, false, false, 
> false, false, false, false]][12:17:56] (err) Failed
> to execute compound future reducer: GridCompoundFuture [rdc=null, initFlag=1, 
> lsnrCalls=0, done=false, cancelled=false, err=null, futs=[true, true, false, 
> false, false, false, false, false, false, false, fal
> se, false, false, false, false, false, false, false, false, false, false, 
> false, false, false, false, false, false, false, false, false, false, false, 
> false, false, false, false, false]]
> class org.apache.ignite.IgniteCheckedException: DataStreamer request failed 
> [node=16a20d0c-4009-4bfa-ad6e-0261d9e3b2a3]
>         at 
> org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl$Buffer.onResponse(DataStreamerImpl.java:1785)
>         at 
> org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl$3.onMessage(DataStreamerImpl.java:333)
>         at 
> org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556)
>         at 
> org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184)
>         at 
> org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
>         at 
> org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1097)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: class org.apache.ignite.IgniteCheckedException: DataStreamer will 
> retry data transfer at stable topology [reqTop=AffinityTopologyVersion 
> [topVer=56, minorTopVer=0], topVer=AffinityTopologyVersion
> [topVer=56, minorTopVer=1], node=remote]
>         at 
> org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.localUpdate(DataStreamProcessor.java:343)
>         at 
> org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.processRequest(DataStreamProcessor.java:301)
>         at 
> org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.access$000(DataStreamProcessor.java:58)
>         at 
> org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor$1.onMessage(DataStreamProcessor.java:88)
>         ... 7 more
> {noformat}
> 2 drivers were able to resume streaming after some time, but 1 was not (it 
> kept printing the error messages). That driver had high heap utilization, 
> which resulted in a long GC pause. Eventually the other nodes considered it 
> failed.
> All clients also have the following in their logs (not sure whether it is 
> connected to the problem):
> {noformat}
> [2017-07-06 12:18:11,780][INFO ][exchange-worker-#41%null%][time] Started 
> exchange init [topVer=AffinityTopologyVersion [topVer=56, minorTopVer=1], 
> crd=false, evt=18, node=TcpDiscoveryNode 
> [id=71174cde-6bcf-43b6-a97b-02e0c987a8da, addrs=[127.0.0.1, 172.25.1.31], 
> sockAddrs=[testagent01.gridgain.local/172.25.1.31:0, /127.0.0.1:0], 
> discPort=0, order=49, intOrder=0, lastExchangeTime=1499332605385, loc=true, 
> ver=2.1.0#20170705-sha1:ad42f620, isClient=true], evtNode=TcpDiscoveryNode 
> [id=71174cde-6bcf-43b6-a97b-02e0c987a8da, addrs=[127.0.0.1, 172.25.1.31], 
> sockAddrs=[testagent01.gridgain.local/172.25.1.31:0, /127.0.0.1:0], 
> discPort=0, order=49, intOrder=0, lastExchangeTime=1499332605385, loc=true, 
> ver=2.1.0#20170705-sha1:ad42f620, isClient=true], 
> customEvt=CacheAffinityChangeMessage 
> [id=b5a8f271d51-f9bb8d96-c609-4de4-b32f-761c2a33ad10, 
> topVer=AffinityTopologyVersion [topVer=48, minorTopVer=0], exchId=null, 
> partsMsg=null, exchangeNeeded=true]]
> [2017-07-06 12:18:12,284][WARN 
> ][pool-5-thread-2][GridDhtPartitionTopologyImpl] Requested topology version 
> does not match calculated diff, will require full iteration tocalculate 
> mapping [topVer=AffinityTopologyVersion [topVer=56, minorTopVer=0], 
> diffVer=AffinityTopologyVersion [topVer=56, minorTopVer=1]]
> [2017-07-06 12:18:12,642][WARN ][sys-#56%null%][GridDhtPartitionTopologyImpl] 
> Requested topology version does not match calculated diff, will require full 
> iteration tocalculate mapping [topVer=AffinityTopologyVersion [topVer=56, 
> minorTopVer=1], diffVer=AffinityTopologyVersion [topVer=56, minorTopVer=0]]
> [2017-07-06 12:18:12,919][INFO ][exchange-worker-#41%null%][time] Finished 
> exchange init [topVer=AffinityTopologyVersion [topVer=56, minorTopVer=1], 
> crd=false]
> {noformat}
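
The "Caused by" line in the first trace carries two topology versions: reqTop (56, 0) and topVer (56, 1). Going by the field names, the request appears to have been mapped at an older version than the one the remote node had already reached. The sketch below is a hypothetical log-analysis helper, not part of Ignite: the class name TopologyVersionCheck and its methods are made up for illustration, and it assumes versions are ordered by topVer first and minorTopVer second. It uses only the JDK and expects the log line already unwrapped onto a single line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading the log excerpts above; not an Ignite API.
class TopologyVersionCheck {
    // Matches e.g. "AffinityTopologyVersion [topVer=56, minorTopVer=1]".
    private static final Pattern VERSION = Pattern.compile(
        "AffinityTopologyVersion\\s*\\[topVer=(\\d+),\\s*minorTopVer=(\\d+)\\]");

    /** Returns every {topVer, minorTopVer} pair found in the text, in order. */
    static List<long[]> parse(String text) {
        List<long[]> out = new ArrayList<>();
        Matcher m = VERSION.matcher(text);
        while (m.find())
            out.add(new long[] {Long.parseLong(m.group(1)), Long.parseLong(m.group(2))});
        return out;
    }

    /** Assumed ordering: compare topVer first, then minorTopVer. */
    static int compare(long[] a, long[] b) {
        int c = Long.compare(a[0], b[0]);
        return c != 0 ? c : Long.compare(a[1], b[1]);
    }

    /** True if the first version in the line (reqTop) is older than the second (topVer). */
    static boolean requestIsStale(String causedByLine) {
        List<long[]> v = parse(causedByLine);
        return v.size() >= 2 && compare(v.get(0), v.get(1)) < 0;
    }

    public static void main(String[] args) {
        // The "Caused by" message from the trace, joined onto one line.
        String line = "DataStreamer will retry data transfer at stable topology "
            + "[reqTop=AffinityTopologyVersion [topVer=56, minorTopVer=0], "
            + "topVer=AffinityTopologyVersion [topVer=56, minorTopVer=1], node=remote]";
        System.out.println("stale request: " + requestIsStale(line));
    }
}
```

Run against the message from this report, it prints "stale request: true". That matches the exchange for minorTopVer=1 that the client logs show starting and finishing around the same time.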



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
