[ 
https://issues.apache.org/jira/browse/IGNITE-5707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ksenia Rybakova updated IGNITE-5707:
------------------------------------
    Description: 
Load test config:
- CacheRandomOperationBenchmark
- 8 clients, 48 servers at 8 hosts
- 26 physical caches of different types with different memory policies + 30 
groups with 10 partitioned caches each + 20 groups with 10 replicated caches 
each. Total 526 caches.
- Preloading amount: 50K, key range: 60K
Complete configs are attached.

3 of 8 clients log the following messages during preloading:
{noformat}
[12:17:56] (err) Failed to execute compound future reducer: GridCompoundFuture 
[rdc=null, initFlag=1, lsnrCalls=0, done=false, cancelled=false, err=null, 
futs=[true, false, false]]
[12:17:56] (err) Failed to execute compound future reducer: GridCompoundFuture 
[rdc=null, initFlag=1, lsnrCalls=0, done=false, cancelled=false, err=null, 
futs=[true, true, false, false, false, false, false, false, false, false, 
false, false, false, false, false, false, false, false, false, false, false, 
false, false, false, false, false, false, false, false, false, false, false, 
false, false, false, false, false]]
[12:17:56] (err) Failed to execute compound future reducer: GridCompoundFuture 
[rdc=null, initFlag=1, lsnrCalls=0, done=false, cancelled=false, err=null, 
futs=[true, true, false, false, false, false, false, false, false, false, 
false, false, false, false, false, false, false, false, false, false, false, 
false, false, false, false, false, false, false, false, false, false, false, 
false, false, false, false, false]]
class org.apache.ignite.IgniteCheckedException: DataStreamer request failed 
[node=16a20d0c-4009-4bfa-ad6e-0261d9e3b2a3]
        at 
org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl$Buffer.onResponse(DataStreamerImpl.java:1785)
        at 
org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl$3.onMessage(DataStreamerImpl.java:333)
        at 
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556)
        at 
org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184)
        at 
org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
        at 
org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1097)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: class org.apache.ignite.IgniteCheckedException: DataStreamer will 
retry data transfer at stable topology [reqTop=AffinityTopologyVersion 
[topVer=56, minorTopVer=0], topVer=AffinityTopologyVersion
[topVer=56, minorTopVer=1], node=remote]
        at 
org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.localUpdate(DataStreamProcessor.java:343)
        at 
org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.processRequest(DataStreamProcessor.java:301)
        at 
org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.access$000(DataStreamProcessor.java:58)
        at 
org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor$1.onMessage(DataStreamProcessor.java:88)
        ... 7 more
{noformat}
Two drivers were able to resume streaming after some time, but one was not 
(it kept printing the error messages). That driver had high heap utilization, 
which resulted in a long GC pause, and it was eventually considered failed by 
the other nodes.
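
For triage, it can help to quantify how many inner futures were still pending in 
each "Failed to execute compound future reducer" line. The sketch below is not 
part of Ignite; the class and method names are hypothetical, and it assumes that 
each flag in futs=[...] reflects whether the corresponding inner future is done. 
Log lines that were wrapped by the mailer need to be joined before parsing.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical triage helper; not an Ignite class. */
public class CompoundFutureLogStats {
    // Captures the comma-separated flags inside the first futs=[...] list.
    private static final Pattern FUTS = Pattern.compile("futs=\\[([^\\]]*)\\]");

    /**
     * Returns {completed, pending} for the first futs=[...] list in the line,
     * or null if the line has no such list.
     * Assumption: "true" means the inner future is done, "false" means pending.
     */
    public static int[] countFutures(String logLine) {
        Matcher m = FUTS.matcher(logLine);
        if (!m.find())
            return null;

        int completed = 0;
        int pending = 0;
        for (String flag : m.group(1).split(",")) {
            String f = flag.trim();
            if (f.equals("true"))
                completed++;
            else if (f.equals("false"))
                pending++;
        }
        return new int[] {completed, pending};
    }

    public static void main(String[] args) {
        // First message from the report, joined into a single line.
        String line = "GridCompoundFuture [rdc=null, initFlag=1, lsnrCalls=0, done=false, "
            + "cancelled=false, err=null, futs=[true, false, false]]";
        int[] counts = countFutures(line);
        System.out.println("completed=" + counts[0] + ", pending=" + counts[1]);
    }
}
```

Run against the first message above, this prints completed=1, pending=2.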

In addition, all clients have the following in their logs:
{noformat}
[2017-07-06 12:18:11,780][INFO ][exchange-worker-#41%null%][time] Started 
exchange init [topVer=AffinityTopologyVersion [topVer=56, minorTopVer=1], 
crd=false, evt=18, node=TcpDiscoveryNode 
[id=71174cde-6bcf-43b6-a97b-02e0c987a8da, addrs=[127.0.0.1, 172.25.1.31], 
sockAddrs=[testagent01.gridgain.local/172.25.1.31:0, /127.0.0.1:0], discPort=0, 
order=49, intOrder=0, lastExchangeTime=1499332605385, loc=true, 
ver=2.1.0#20170705-sha1:ad42f620, isClient=true], evtNode=TcpDiscoveryNode 
[id=71174cde-6bcf-43b6-a97b-02e0c987a8da, addrs=[127.0.0.1, 172.25.1.31], 
sockAddrs=[testagent01.gridgain.local/172.25.1.31:0, /127.0.0.1:0], discPort=0, 
order=49, intOrder=0, lastExchangeTime=1499332605385, loc=true, 
ver=2.1.0#20170705-sha1:ad42f620, isClient=true], 
customEvt=CacheAffinityChangeMessage 
[id=b5a8f271d51-f9bb8d96-c609-4de4-b32f-761c2a33ad10, 
topVer=AffinityTopologyVersion [topVer=48, minorTopVer=0], exchId=null, 
partsMsg=null, exchangeNeeded=true]]
[2017-07-06 12:18:12,284][WARN ][pool-5-thread-2][GridDhtPartitionTopologyImpl] 
Requested topology version does not match calculated diff, will require full 
iteration tocalculate mapping [topVer=AffinityTopologyVersion [topVer=56, 
minorTopVer=0], diffVer=AffinityTopologyVersion [topVer=56, minorTopVer=1]]
[2017-07-06 12:18:12,642][WARN ][sys-#56%null%][GridDhtPartitionTopologyImpl] 
Requested topology version does not match calculated diff, will require full 
iteration tocalculate mapping [topVer=AffinityTopologyVersion [topVer=56, 
minorTopVer=1], diffVer=AffinityTopologyVersion [topVer=56, minorTopVer=0]]
[2017-07-06 12:18:12,919][INFO ][exchange-worker-#41%null%][time] Finished 
exchange init [topVer=AffinityTopologyVersion [topVer=56, minorTopVer=1], 
crd=false]
{noformat}

  was:
Load test config:
- CacheRandomOperationBenchmark
- 8 clients, 48 servers at 8 hosts
- 26 physical caches of different types with different memory policies + 30 
groups with 10 partitioned caches each + 20 groups with 10 replicated caches 
each. Total 526 caches.
- Preloading amount: 50K, key range: 60K
Complete configs are attached.

3 of 8 clients have following messages during preloading:
{noformat}
[12:17:56] (err) Failed to execute compound future reducer: GridCompoundFuture 
[rdc=null, initFlag=1, lsnrCalls=0, done=false, cancelled=false, err=null, 
futs=[true, false, false]]
[12:17:56] (err) Failed to execute compound future reducer: GridCompoundFuture 
[rdc=null, initFlag=1, lsnrCalls=0, done=false, cancelled=false, err=null, 
futs=[true, true, false, false, false, false, false, false, false, false, 
false, false, false, false, false, false, false, false, false, false, false, 
false, false, false, false, false, false, false, false, false, false, false, 
false, false, false, false, false]]
[12:17:56] (err) Failed to execute compound future reducer: GridCompoundFuture 
[rdc=null, initFlag=1, lsnrCalls=0, done=false, cancelled=false, err=null, 
futs=[true, true, false, false, false, false, false, false, false, false, 
false, false, false, false, false, false, false, false, false, false, false, 
false, false, false, false, false, false, false, false, false, false, false, 
false, false, false, false, false]]
class org.apache.ignite.IgniteCheckedException: DataStreamer request failed 
[node=16a20d0c-4009-4bfa-ad6e-0261d9e3b2a3]
        at 
org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl$Buffer.onResponse(DataStreamerImpl.java:1785)
        at 
org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl$3.onMessage(DataStreamerImpl.java:333)
        at 
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556)
        at 
org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184)
        at 
org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
        at 
org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1097)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: class org.apache.ignite.IgniteCheckedException: DataStreamer will 
retry data transfer at stable topology [reqTop=AffinityTopologyVersion 
[topVer=56, minorTopVer=0], topVer=AffinityTopologyVersion
[topVer=56, minorTopVer=1], node=remote]
        at 
org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.localUpdate(DataStreamProcessor.java:343)
        at 
org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.processRequest(DataStreamProcessor.java:301)
        at 
org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.access$000(DataStreamProcessor.java:58)
        at 
org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor$1.onMessage(DataStreamProcessor.java:88)
        ... 7 more
{noformat}
2 drivers were able to resume streaming after some time, but 1 didn't (error 
messages continued to be printed). This driver had high heap utilization, that 
resulted in long GC pause. Finally it was considered failed by other nodes.

Also it was observed that all clients have in log:
{noformat}
[2017-07-06 12:18:11,780][INFO ][exchange-worker-#41%null%][time] Started 
exchange init [topVer=AffinityTopologyVersion [topVer=56, minorTopVer=1], 
crd=false, evt=18, node=TcpDiscoveryNode 
[id=71174cde-6bcf-43b6-a97b-02e0c987a8da, addrs=[127.0.0.1, 172.25.1.31], 
sockAddrs=[testagent01.gridgain.local/172.25.1.31:0, /127.0.0.1:0], discPort=0, 
order=49, intOrder=0, lastExchangeTime=1499332605385, loc=true, 
ver=2.1.0#20170705-sha1:ad42f620, isClient=true], evtNode=TcpDiscoveryNode 
[id=71174cde-6bcf-43b6-a97b-02e0c987a8da, addrs=[127.0.0.1, 172.25.1.31], 
sockAddrs=[testagent01.gridgain.local/172.25.1.31:0, /127.0.0.1:0], discPort=0, 
order=49, intOrder=0, lastExchangeTime=1499332605385, loc=true, 
ver=2.1.0#20170705-sha1:ad42f620, isClient=true], 
customEvt=CacheAffinityChangeMessage 
[id=b5a8f271d51-f9bb8d96-c609-4de4-b32f-761c2a33ad10, 
topVer=AffinityTopologyVersion [topVer=48, minorTopVer=0], exchId=null, 
partsMsg=null, exchangeNeeded=true]]
[2017-07-06 12:18:12,284][WARN ][pool-5-thread-2][GridDhtPartitionTopologyImpl] 
Requested topology version does not match calculated diff, will require full 
iteration tocalculate mapping [topVer=AffinityTopologyVersion [topVer=56, 
minorTopVer=0], diffVer=AffinityTopologyVersion [topVer=56, minorTopVer=1]]
[2017-07-06 12:18:12,642][WARN ][sys-#56%null%][GridDhtPartitionTopologyImpl] 
Requested topology version does not match calculated diff, will require full 
iteration tocalculate mapping [topVer=AffinityTopologyVersion [topVer=56, 
minorTopVer=1], diffVer=AffinityTopologyVersion [topVer=56, minorTopVer=0]]
[2017-07-06 12:18:12,919][INFO ][exchange-worker-#41%null%][time] Finished 
exchange init [topVer=AffinityTopologyVersion [topVer=56, minorTopVer=1], 
crd=false]
{noformat}


> Client can't resume streaming even after topology got stable during load test
> -----------------------------------------------------------------------------
>
>                 Key: IGNITE-5707
>                 URL: https://issues.apache.org/jira/browse/IGNITE-5707
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.1
>            Reporter: Ksenia Rybakova
>



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)