Hi,

I see that you use both the IPv4 and IPv6 stacks. I think this could be affecting the behavior, so I would recommend setting -Djava.net.preferIPv4Stack=true on all nodes in the cluster (including clients).

Also, why did you decide to set these parameters on the communication SPI:
<property name="messageQueueLimit" value="3000"/>
<property name="socketWriteTimeout" value="6000"/>
<property name="socketSendBuffer" value="65536"/>

If you still see this problem after the suggested change, please provide full logs from all nodes in the cluster.

Evgenii

2018-01-23 9:35 GMT+03:00 xiang jie <xia...@cosconetwork.com.cn>:

> Hi,
>
> We have deployed Ignite 2.3, created about 100 caches, and loaded data
> from Oracle into these caches.
>
> But clients (started in Tomcat web apps under development, which we often
> start and stop) often cannot connect to the Ignite server after one or two
> days. Searching the logs, we found the errors below on the server side:
>
> [2018-01-23 12:43:30,979][WARN ][grid-timeout-worker-#23%igniteCosco%][TcpDiscoverySpi]
> Socket write has timed out (consider increasing
> 'IgniteConfiguration.failureDetectionTimeout' configuration property)
> [failureDetectionTimeout=10000, rmtAddr=/172.41.27.66:3871, rmtPort=3871,
> sockTimeout=5000]
>
> [2018-01-23 12:43:30,980][ERROR][tcp-disco-sock-reader-#63%igniteCosco%][TcpDiscoverySpi]
> Caught exception on message read
> [sock=Socket[addr=/172.41.27.66,port=3871,localport=47500],
> locNodeId=89865e33-c722-4989-b663-a75c936b068f,
> rmtNodeId=ed9d5cc9-11d3-4bfd-a648-1ae718ac16f9]
>
> class org.apache.ignite.IgniteCheckedException: Failed to deserialize object with given class loader: sun.misc.Launcher$AppClassLoader@7390d1e8
>     at org.apache.ignite.marshaller.jdk.JdkMarshaller.unmarshal0(JdkMarshaller.java:129)
>     at org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.unmarshal(AbstractNodeNameAwareMarshaller.java:94)
>     at org.apache.ignite.internal.util.IgniteUtils.unmarshal(IgniteUtils.java:9740)
>     at org.apache.ignite.spi.discovery.tcp.ServerImpl$SocketReader.body(ServerImpl.java:5946)
>     at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
> Caused by: java.net.SocketException: Socket closed
>     at java.net.SocketInputStream.socketRead0(Native Method)
>     at java.net.SocketInputStream.read(SocketInputStream.java:152)
>     at java.net.SocketInputStream.read(SocketInputStream.java:122)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>     at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>     at org.apache.ignite.marshaller.jdk.JdkMarshallerInputStreamWrapper.read(JdkMarshallerInputStreamWrapper.java:53)
>     at java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2310)
>     at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2323)
>     at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2794)
>     at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:801)
>     at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
>     at org.apache.ignite.marshaller.jdk.JdkMarshallerObjectInputStream.<init>(JdkMarshallerObjectInputStream.java:39)
>     at org.apache.ignite.marshaller.jdk.JdkMarshaller.unmarshal0(JdkMarshaller.java:119)
>     ... 4 more
>
> [2018-01-23 12:43:30,980][DEBUG][tcp-disco-sock-reader-#63%igniteCosco%][TcpDiscoverySpi]
> Client connection failed
> [sock=Socket[addr=/172.41.27.66,port=3871,localport=47500],
> locNodeId=89865e33-c722-4989-b663-a75c936b068f,
> rmtNodeId=ed9d5cc9-11d3-4bfd-a648-1ae718ac16f9]
>
> [2018-01-23 12:43:30,981][INFO ][tcp-disco-sock-reader-#63%igniteCosco%][TcpDiscoverySpi]
> Finished serving remote node connection [rmtAddr=/172.41.27.66:3871,
> rmtPort=3871
>
> [2018-01-23 12:43:30,981][DEBUG][tcp-disco-sock-reader-#63%igniteCosco%][TcpDiscoverySpi]
> Grid runnable finished normally: tcp-disco-sock-reader-#63%igniteCosco%
>
> [2018-01-23 12:43:30,981][ERROR][tcp-disco-client-message-worker-#64%igniteCosco%][TcpDiscoverySpi]
> Client connection failed
> [sock=Socket[addr=/172.41.27.66,port=3871,localport=47500],
> locNodeId=89865e33-c722-4989-b663-a75c936b068f,
> rmtNodeId=ed9d5cc9-11d3-4bfd-a648-1ae718ac16f9,
> msg=TcpDiscoveryNodeAddedMessage [node=TcpDiscoveryNode
> [id=ed9d5cc9-11d3-4bfd-a648-1ae718ac16f9, addrs=[0:0:0:0:0:0:0:1,
> 127.0.0.1, 172.41.27.66, 192.168.126.1, 192.168.81.1,
> 2001:0:9d38:6ab8:8aa:3888:2532:627e],
> sockAddrs=[/2001:0:9d38:6ab8:8aa:3888:2532:627e:0,
> /192.168.81.1:0, /127.0.0.1:0, /0:0:0:0:0:0:0:1:0, /192.168.126.1:0,
> /172.41.27.66:0], discPort=0, order=0, intOrder=340,
> lastExchangeTime=1516682580894, loc=false, ver=2.3.0#20171028-sha1:8add7fd5,
> isClient=true],
> dataPacket=o.a.i.spi.discovery.tcp.internal.DiscoveryDataPacket@4d671e9d, discardMsgId=null,
> discardCustomMsgId=null, top=[TcpDiscoveryNode
> [id=fbf2344b-652b-4c5c-ad78-a470ec8c62e8, addrs=[0:0:0:0:0:0:0:1%lo,
> 127.0.0.1, 172.40.0.205, 192.168.122.1], sockAddrs=[/0:0:0:0:0:0:0:1%lo:47500,
> /127.0.0.1:47500, /172.40.0.205:47500, /192.168.122.1:47500],
> discPort=47500, order=618, intOrder=310, lastExchangeTime=1516673277178,
> loc=false, ver=2.3.0#20171028-sha1:8add7fd5, isClient=false],
> TcpDiscoveryNode [id=48e4ab39-0baa-40da-8dff-382253b2f72a,
> addrs=[0:0:0:0:0:0:0:1, 10.10.11.93, 127.0.0.1, 192.168.0.93,
> 192.168.1.90], sockAddrs=[/127.0.0.1:0, /0:0:0:0:0:0:0:1:0,
> /192.168.1.90:0, /10.10.11.93:0, /192.168.0.93:0], discPort=0, order=620,
> intOrder=312, lastExchangeTime=1516673277178, loc=false,
> ver=2.3.0#20171028-sha1:8add7fd5, isClient=true], TcpDiscoveryNode
> [id=c57f1db3-878e-47c3-bd2b-f17882d0ba2d, addrs=[0:0:0:0:0:0:0:1%1,
> 127.0.0.1, 172.17.0.1, 172.40.0.82, 192.168.2.1],
> sockAddrs=[/172.40.0.82:47500, /127.0.0.1:47500, /0:0:0:0:0:0:0:1%1:47500,
> /192.168.2.1:47500, /172.17.0.1:47500], discPort=47500, order=624,
> intOrder=316, lastExchangeTime=1516673277232, loc=false,
> ver=2.3.0#20171028-sha1:8add7fd5, isClient=false], TcpDiscoveryNode
> [id=89865e33-c722-4989-b663-a75c936b068f, addrs=[0:0:0:0:0:0:0:1%1,
> 127.0.0.1, 172.40.0.84], sockAddrs=[cptest84/172.40.0.84:47500,
> /127.0.0.1:47500, /0:0:0:0:0:0:0:1%1:47500], discPort=47500, order=627,
> intOrder=318, lastExchangeTime=1516682610520, loc=true,
> ver=2.3.0#20171028-sha1:8add7fd5, isClient=false], TcpDiscoveryNode
> [id=33b4f6a4-4cf9-4b71-8e82-32fe708c0879, addrs=[0:0:0:0:0:0:0:1,
> 10.10.11.93, 127.0.0.1, 192.168.0.93, 192.168.1.90],
> sockAddrs=[/127.0.0.1:0, /0:0:0:0:0:0:0:1:0, /192.168.1.90:0, /10.10.11.93:0,
> /192.168.0.93:0], discPort=0, order=640, intOrder=324,
> lastExchangeTime=1516674179296, loc=false, ver=2.3.0#20171028-sha1:8add7fd5,
> isClient=true], TcpDiscoveryNode [id=c7b18b5d-1259-4366-94bc-7af0317af115,
> addrs=[0:0:0:0:0:0:0:1, 10.10.11.93, 127.0.0.1, 192.168.0.93,
> 192.168.1.90], sockAddrs=[/127.0.0.1:0, /0:0:0:0:0:0:0:1:0,
> /192.168.1.90:0, /10.10.11.93:0, /192.168.0.93:0], discPort=0, order=641,
> intOrder=325, lastExchangeTime=1516674232838, loc=false,
> ver=2.3.0#20171028-sha1:8add7fd5, isClient=true], TcpDiscoveryNode
> [id=e9c05694-deee-4f91-a477-7408d22a2bc2, addrs=[0:0:0:0:0:0:0:1,
> 10.10.11.93, 127.0.0.1, 192.168.0.93, 192.168.1.90],
> sockAddrs=[/127.0.0.1:0, /0:0:0:0:0:0:0:1:0,
> /192.168.1.90:0, /10.10.11.93:0, /192.168.0.93:0], discPort=0, order=652, intOrder=331,
> lastExchangeTime=1516674382690, loc=false, ver=2.3.0#20171028-sha1:8add7fd5,
> isClient=true], TcpDiscoveryNode [id=7b7b2351-dc75-481a-b27c-cd4c8d881448,
> addrs=[0:0:0:0:0:0:0:1, 10.10.11.93, 127.0.0.1, 192.168.0.93,
> 192.168.1.90], sockAddrs=[/127.0.0.1:0, /0:0:0:0:0:0:0:1:0,
> /192.168.1.90:0, /10.10.11.93:0, /192.168.0.93:0], discPort=0, order=657,
> intOrder=333, lastExchangeTime=1516674490267, loc=false,
> ver=2.3.0#20171028-sha1:8add7fd5, isClient=true]], clientTop=null,
> gridStartTime=1516375081818, super=TcpDiscoveryAbstractMessage
> [sndNodeId=null, id=20628b02161-fbf2344b-652b-4c5c-ad78-a470ec8c62e8,
> verifierNodeId=fbf2344b-652b-4c5c-ad78-a470ec8c62e8, topVer=0,
> pendingIdx=0, failedNodes=null, isClient=true]]]
>
> java.net.SocketException: Socket closed
>     at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:121)
>     at java.net.SocketOutputStream.write(SocketOutputStream.java:147)
>     at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.writeToSocket(TcpDiscoverySpi.java:1437)
>     at org.apache.ignite.spi.discovery.tcp.ServerImpl$ClientMessageWorker.processMessage(ServerImpl.java:6500)
>     at org.apache.ignite.spi.discovery.tcp.ServerImpl$ClientMessageWorker.processMessage(ServerImpl.java:6383)
>     at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:6648)
>     at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
>
> [2018-01-23 12:43:30,983][DEBUG][tcp-disco-client-message-worker-#64%igniteCosco%][TcpDiscoverySpi]
> Grid runnable finished due to interruption without cancellation:
> tcp-disco-client-message-worker-#64%igniteCosco%
>
> And the client-side error message is:
>
> [2018-01-23 12:43:55,713][ERROR][tcp-client-disco-sock-reader-#3%igniteCosco%][TcpDiscoverySpi]
> Failed to read message
> [sock=Socket[addr=/172.40.0.84,port=47500,localport=3871],
> locNodeId=ed9d5cc9-11d3-4bfd-a648-1ae718ac16f9,
> rmtNodeId=89865e33-c722-4989-b663-a75c936b068f]
>
> class org.apache.ignite.IgniteCheckedException: Failed to deserialize object with given class loader: WebappClassLoader
> context: /COSNLOGIS
> delegate: false
> repositories:
> /WEB-INF/classes/
> ----------> Parent Classloader:
> java.net.URLClassLoader@22db19d3
>
>     at org.apache.ignite.marshaller.jdk.JdkMarshaller.unmarshal0(JdkMarshaller.java:129)
>     at org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.unmarshal(AbstractNodeNameAwareMarshaller.java:94)
>     at org.apache.ignite.internal.util.IgniteUtils.unmarshal(IgniteUtils.java:9740)
>     at org.apache.ignite.spi.discovery.tcp.ClientImpl$SocketReader.body(ClientImpl.java:1001)
>     at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
> Caused by: java.io.EOFException
>     at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2304)
>     at java.io.ObjectInputStream$BlockDataInputStream.readUTFBody(ObjectInputStream.java:3042)
>     at java.io.ObjectInputStream$BlockDataInputStream.readUTF(ObjectInputStream.java:2843)
>     at java.io.ObjectInputStream.readString(ObjectInputStream.java:1617)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1338)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
>     at org.apache.ignite.internal.util.IgniteUtils.readMap(IgniteUtils.java:5146)
>     at org.apache.ignite.spi.discovery.tcp.internal.TcpDiscoveryNode.readExternal(TcpDiscoveryNode.java:617)
>     at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1816)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1775)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
>     at java.util.ArrayList.readObject(ArrayList.java:733)
>     at sun.reflect.GeneratedMethodAccessor53.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:601)
>     at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1004)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1872)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1970)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1894)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
>     at java.util.TreeMap.buildFromSorted(TreeMap.java:2403)
>     at java.util.TreeMap.buildFromSorted(TreeMap.java:2386)
>     at java.util.TreeMap.buildFromSorted(TreeMap.java:2386)
>     at java.util.TreeMap.buildFromSorted(TreeMap.java:2386)
>     at java.util.TreeMap.buildFromSorted(TreeMap.java:2418)
>     at java.util.TreeMap.buildFromSorted(TreeMap.java:2386)
>     at java.util.TreeMap.buildFromSorted(TreeMap.java:2344)
>     at java.util.TreeMap.readObject(TreeMap.java:2290)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:601)
>     at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1004)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1872)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1970)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1894)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
>     at org.apache.ignite.marshaller.jdk.JdkMarshaller.unmarshal0(JdkMarshaller.java:121)
>     ... 4 more
>
> [2018-01-23 12:43:55,724][ERROR][tcp-client-disco-sock-reader-#3%igniteCosco%][TcpDiscoverySpi]
> Connection failed [sock=Socket[addr=/172.40.0.84,port=47500,localport=3871],
> locNodeId=ed9d5cc9-11d3-4bfd-a648-1ae718ac16f9]
>
> java.io.EOFException
>     at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2304)
>     at java.io.ObjectInputStream$BlockDataInputStream.readUTFBody(ObjectInputStream.java:3042)
>     at java.io.ObjectInputStream$BlockDataInputStream.readUTF(ObjectInputStream.java:2843)
>     at java.io.ObjectInputStream.readString(ObjectInputStream.java:1617)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1338)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
>     at org.apache.ignite.internal.util.IgniteUtils.readMap(IgniteUtils.java:5146)
>     at org.apache.ignite.spi.discovery.tcp.internal.TcpDiscoveryNode.readExternal(TcpDiscoveryNode.java:617)
>     at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1816)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1775)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
>     at java.util.ArrayList.readObject(ArrayList.java:733)
>     at sun.reflect.GeneratedMethodAccessor53.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:601)
>     at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1004)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1872)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1970)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1894)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
>     at java.util.TreeMap.buildFromSorted(TreeMap.java:2403)
>     at java.util.TreeMap.buildFromSorted(TreeMap.java:2386)
>     at java.util.TreeMap.buildFromSorted(TreeMap.java:2386)
>     at java.util.TreeMap.buildFromSorted(TreeMap.java:2386)
>     at java.util.TreeMap.buildFromSorted(TreeMap.java:2418)
>     at java.util.TreeMap.buildFromSorted(TreeMap.java:2386)
>     at java.util.TreeMap.buildFromSorted(TreeMap.java:2344)
>     at java.util.TreeMap.readObject(TreeMap.java:2290)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:601)
>     at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1004)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1872)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1970)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1894)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
>     at org.apache.ignite.marshaller.jdk.JdkMarshaller.unmarshal0(JdkMarshaller.java:121)
>     at org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.unmarshal(AbstractNodeNameAwareMarshaller.java:94)
>     at org.apache.ignite.internal.util.IgniteUtils.unmarshal(IgniteUtils.java:9740)
>     at org.apache.ignite.spi.discovery.tcp.ClientImpl$SocketReader.body(ClientImpl.java:1001)
>     at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
>
> [2018-01-23 12:43:55,724][DEBUG][tcp-client-disco-msg-worker-#4%igniteCosco%][TcpDiscoverySpi]
> Connection closed, will try to restore connection.
>
> [2018-01-23 12:43:55,730][DEBUG][tcp-client-disco-reconnector-#5%igniteCosco%][TcpDiscoverySpi]
> Started reconnect process [join=true, timeout=120000]
>
> [2018-01-23 12:43:55,730][DEBUG][tcp-client-disco-reconnector-#5%igniteCosco%][TcpDiscoverySpi]
> Resolved addresses from IP finder: [/172.40.0.84:47500]
>
> After restarting the server, everything is OK, but clients cannot connect
> again one or two days later. Do you have any idea about this situation?
> Attached is the server config file, and I can provide a debug log file if
> needed.
>
> Thank you!
>
> Regards
> Jerry