Hello!

The thread dumps show that some data processing is happening, but the log
also contains this:

[11:16:11,637][INFO][grid-nio-worker-tcp-comm-2-#26][TcpCommunicationSpi]
Accepted incoming communication connection [locAddr=/172.16.1.7:47100,
rmtAddr=/10.139.0.10:38624]
[11:16:12,686][SEVERE][grid-nio-worker-tcp-comm-2-#26][TcpCommunicationSpi]
Failed to process selector key [ses=GridSelectorNioSessionImpl
[worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=2,
bytesRcvd=430031923, bytesSent=2154539, bytesRcvd0=6974058,
bytesSent0=1976, select=true, super=GridWorker
[name=grid-nio-worker-tcp-comm-2, igniteInstanceName=null, finished=false,
heartbeatTs=1581074171663, hashCode=1764437028, interrupted=false,
runner=grid-nio-worker-tcp-comm-2-#26]]],
writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768],
readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768],
inRecovery=GridNioRecoveryDescriptor [acked=384, resendCnt=0, rcvCnt=422,
sentCnt=413, reserved=true, lastAck=416, nodeLeft=false,
node=TcpDiscoveryNode [id=a66a573a-43dc-48d2-8ee5-232e727acbc9,
addrs=[10.139.64.10, 127.0.0.1], sockAddrs=[/10.139.64.10:0, /127.0.0.1:0],
discPort=0, order=19, intOrder=19, lastExchangeTime=1581073961809,
loc=false, ver=2.7.6#20190911-sha1:21f7ca41, isClient=true],
connected=true, connectCnt=0, queueLimit=4096, reserveCnt=1,
pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=384,
resendCnt=0, rcvCnt=422, sentCnt=413, reserved=true, lastAck=416,
nodeLeft=false, node=TcpDiscoveryNode
[id=a66a573a-43dc-48d2-8ee5-232e727acbc9, addrs=[10.139.64.10, 127.0.0.1],
sockAddrs=[/10.139.64.10:0, /127.0.0.1:0], discPort=0, order=19,
intOrder=19, lastExchangeTime=1581073961809, loc=false,
ver=2.7.6#20190911-sha1:21f7ca41, isClient=true], connected=true,
connectCnt=0, queueLimit=4096, reserveCnt=1, pairedConnections=false],
super=GridNioSessionImpl [locAddr=/172.16.1.7:47100, rmtAddr=/
10.139.0.10:37846, createTime=1581073963095, closeTime=0, bytesSent=78611,
bytesRcvd=104294928, bytesSent0=561, bytesRcvd0=916098,
sndSchedTime=1581073963095, lastSndTime=1581074171592,
lastRcvTime=1581074171612, readsPaused=false,
filterChain=FilterChain[filters=[GridNioCodecFilter
[parser=o.a.i.i.util.nio.GridDirectParser@672e22f0, directMode=true],
GridConnectionBytesVerifyFilter], accepted=true, markedForClose=false]]]
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:377)
at org.apache.ignite.internal.util.nio.GridNioServer$DirectNioClientWorker.processRead(GridNioServer.java:1282)
at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeysOptimized(GridNioServer.java:2386)
at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:2153)
at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1794)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.lang.Thread.run(Thread.java:748)

[11:16:46,612][INFO][grid-nio-worker-tcp-comm-2-#26][TcpCommunicationSpi]
Received incoming connection from remote node while connecting to this
node, rejecting [locNode=c7e6fc55-d367-43d5-94e9-79ef1d984601,
locNodeOrder=1, rmtNode=a66a573a-43dc-48d2-8ee5-232e727acbc9,
rmtNodeOrder=19]
[11:16:46,928][INFO][grid-nio-worker-tcp-comm-3-#27][TcpCommunicationSpi]
Accepted incoming communication connection [locAddr=/172.16.1.7:47100,
rmtAddr=/10.139.0.10:38900]
[11:16:46,985][INFO][grid-nio-worker-tcp-comm-3-#27][TcpCommunicationSpi]
Received incoming connection from remote node while connecting to this
node, rejecting [locNode=c7e6fc55-d367-43d5-94e9-79ef1d984601,
locNodeOrder=1, rmtNode=a66a573a-43dc-48d2-8ee5-232e727acbc9,
rmtNodeOrder=19]
[11:16:47,301][INFO][grid-nio-worker-tcp-comm-0-#24][TcpCommunicationSpi]
Accepted incoming communication connection [locAddr=/172.16.1.7:47100,
rmtAddr=/10.139.0.10:38902]
[11:16:47,359][INFO][grid-nio-worker-tcp-comm-0-#24][TcpCommunicationSpi]
Received incoming connection from remote node while connecting to this
node, rejecting [locNode=c7e6fc55-d367-43d5-94e9-79ef1d984601,
locNodeOrder=1, rmtNode=a66a573a-43dc-48d2-8ee5-232e727acbc9,
rmtNodeOrder=19]
[11:16:47,675][INFO][grid-nio-worker-tcp-comm-1-#25][TcpCommunicationSpi]
Accepted incoming communication connection [locAddr=/172.16.1.7:47100,
rmtAddr=/10.139.0.10:38904]
[11:16:47,733][INFO][grid-nio-worker-tcp-comm-1-#25][TcpCommunicationSpi]
Received incoming connection from remote node while connecting to this
node, rejecting [locNode=c7e6fc55-d367-43d5-94e9-79ef1d984601,
locNodeOrder=1, rmtNode=a66a573a-43dc-48d2-8ee5-232e727acbc9,
rmtNodeOrder=19]
[11:16:48,049][INFO][grid-nio-worker-tcp-comm-2-#26][TcpCommunicationSpi]
Accepted incoming communication connection [locAddr=/172.16.1.7:47100,
rmtAddr=/10.139.0.10:38916]
[11:16:48,106][INFO][grid-nio-worker-tcp-comm-2-#26][TcpCommunicationSpi]
Received incoming connection from remote node while connecting to this
node, rejecting [locNode=c7e6fc55-d367-43d5-94e9-79ef1d984601,
locNodeOrder=1, rmtNode=a66a573a-43dc-48d2-8ee5-232e727acbc9,
rmtNodeOrder=19]
[11:16:48,423][INFO][grid-nio-worker-tcp-comm-3-#27][TcpCommunicationSpi]
Accepted incoming communication connection [locAddr=/172.16.1.7:47100,
rmtAddr=/10.139.0.10:38918]
[11:16:48,481][INFO][grid-nio-worker-tcp-comm-3-#27][TcpCommunicationSpi]
Received incoming connection from remote node while connecting to this
node, rejecting [locNode=c7e6fc55-d367-43d5-94e9-79ef1d984601,
locNodeOrder=1, rmtNode=a66a573a-43dc-48d2-8ee5-232e727acbc9,
rmtNodeOrder=19]

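That's a bad sign: the connection from the client node (a66a573a-...) was
reset by the peer, and its subsequent connection attempts keep getting
rejected. I think you either have network problems or have maxed out your
communication capacity.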

I recommend the following configuration changes to TcpCommunicationSpi:

socketWriteTimeout 5000
usePairedConnections true
connectionsPerNode 4

You may also want to set localAddress to a known-good (reachable) IP
address of the node, on each node.
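
For illustration, here is roughly how these settings could look in a Spring
XML node configuration. This is only a sketch: the property names assume the
standard TcpCommunicationSpi setters, and the local address shown is just the
locAddr that appears in the log above, so it has to be replaced with each
node's own reachable address.

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
  <property name="communicationSpi">
    <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
      <!-- Socket write timeout, in milliseconds -->
      <property name="socketWriteTimeout" value="5000"/>
      <!-- Use separate connections for incoming and outgoing traffic -->
      <property name="usePairedConnections" value="true"/>
      <!-- Number of communication connections per remote node -->
      <property name="connectionsPerNode" value="4"/>
      <!-- Example value only (taken from the log); set per node -->
      <property name="localAddress" value="172.16.1.7"/>
    </bean>
  </property>
</bean>
```

The rest of your IgniteConfiguration (discovery, caches and so on) stays as it
is; only the communicationSpi property is added or changed.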

Regards,
-- 
Ilya Kasnacheev


Fri, Feb 7, 2020 at 14:34, pg31 <singhhoneyyo...@gmail.com>:

> Thanks Ilya.
>
> I have changed the Client Side Machine to prefer IPv4 Stack and hence that
> error went away. But still the data-streamer-stripes and tcp-comm-worker
> threads keep getting stuck.
>
> I am attaching the logs again. (These contain the thread dumps themselves.)
> log.zip <http://apache-ignite-users.70518.x6.nabble.com/file/t2770/log.zip>
>
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>
