Hi,
Your discovery section is non-standard: the same entry is repeated many times. You don't need to repeat the entries; listing each one once is enough.
It is also best to prepend the local/loopback IP, to be on the safe side.
<property name="addresses">
<list>
<value>:47500..47509</value>
<value>:47500..47509</value>
<value>:47500..47509</value>
<value>:47500..47509</value>
<value>:47500..47509</value>
<value>:47500..47509</value>
<value>:47500..47509</value>
<value>:47500..47509</value>
<value>:47500..47509</value>
<value>:47500..47509</value>
<value>:47500..47509</value>
<value>:47500..47509</value>
<value>:47500..47509</value>
<value>:47500..47509</value>
<value>:47500..47509</value>
<value>:47500..47509</value>
<value>:47500..47509</value>
<value>:47500..47509</value>
<value>:47500..47509</value>
<value>:47500..47509</value>
</list>
</property>
The same discovery section is also present on the client side.
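As a rough illustration of what a cleaned-up address list could look like, here is a small stdlib-only Java sketch. It reads "prepend" as putting a loopback entry first in the list, which is only one interpretation, and the host IPs are hypothetical placeholders since the real ones are not shown in this thread. The resulting list is the kind of value that would go into the ip finder's addresses property.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DiscoveryAddresses {
    /** Port range taken from the configuration quoted above. */
    static final String PORT_RANGE = ":47500..47509";

    /** Returns one "host:portRange" entry per distinct host, loopback first. */
    static List<String> build(List<String> hosts) {
        // LinkedHashSet drops duplicates while keeping insertion order.
        Set<String> unique = new LinkedHashSet<>();
        unique.add("127.0.0.1");
        unique.addAll(hosts);

        List<String> addresses = new ArrayList<>();
        for (String host : unique) {
            addresses.add(host + PORT_RANGE);
        }
        return addresses;
    }

    public static void main(String[] args) {
        // Hypothetical hosts, with one deliberately repeated.
        List<String> hosts = Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.1");
        for (String address : build(hosts)) {
            System.out.println(address);
        }
    }
}
```

Each distinct host ends up in the list exactly once, however many times it was repeated in the input.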
Also, the thread pool sizes in your config are on the large side:
<property name="systemThreadPoolSize" value="128"/>
<property name="publicThreadPoolSize" value="128"/>
<property name="queryThreadPoolSize" value="128"/>
<property name="serviceThreadPoolSize" value="128"/>
<property name="stripedPoolSize" value="128"/>
<property name="dataStreamerThreadPoolSize" value="64"/>
<property name="rebalanceThreadPoolSize" value="8"/>
This will create many unnecessary threads unless you have a specific need for
them.
https://apacheignite.readme.io/docs/performance-tips#section-configure-thread-pools
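For a sense of scale, the sketch below compares the configured value with what I believe is the default for the public and system pools, max(8, number of cores). That default is an assumption from memory and should be checked against the IgniteConfiguration javadoc for 2.7.6. The class and method names are made up for illustration.

```java
public class PoolSizeCheck {
    /**
     * Assumed Ignite default for the public/system pools: max(8, cores).
     * This is an assumption; verify it for your Ignite version.
     */
    static int assumedDefault(int cores) {
        return Math.max(8, cores);
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        int configured = 128; // value from the configuration quoted above

        System.out.println("cores=" + cores
            + ", assumed default=" + assumedDefault(cores)
            + ", configured=" + configured);
    }
}
```

If the configured value is many times the core count, most of those threads will sit idle or just add context switching.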
The root cause of the stoppage looks to be a missing or incorrectly configured
library, which shows up as a serialization issue with Joda objects:
Caused by: org.apache.ignite.IgniteException: Failed to create string
representation of binary object.
at
org.apache.ignite.internal.binary.BinaryObjectExImpl.toString(BinaryObjectExImpl.java:189)
~[ignite-core-2.7.6.jar:2.7.6]
at
org.apache.ignite.internal.binary.BinaryObjectImpl.toString(BinaryObjectImpl.java:920)
~[ignite-core-2.7.6.jar:2.7.6]
at java.lang.String.valueOf(String.java:2994) ~[?:1.8.0_201]
at
org.apache.ignite.internal.util.GridStringBuilder.a(GridStringBuilder.java:101)
~[ignite-core-2.7.6.jar:2.7.6]
at
org.apache.ignite.internal.util.tostring.SBLimitedLength.a(SBLimitedLength.java:88)
~[ignite-core-2.7.6.jar:2.7.6]
at
org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:939)
~[ignite-core-2.7.6.jar:2.7.6]
at
org.apache.ignite.internal.util.tostring.GridToStringBuilder.toStringImpl(GridToStringBuilder.java:1005)
~[ignite-core-2.7.6.jar:2.7.6]
... 26 more
Caused by: org.apache.ignite.binary.BinaryObjectException: Failed to read
field: iChronology
at
org.apache.ignite.internal.binary.BinaryReaderExImpl.wrapFieldException(BinaryReaderExImpl.java:447)
~[ignite-core-2.7.6.jar:2.7.6]
at
org.apache.ignite.internal.binary.BinaryReaderExImpl.unmarshalField(BinaryReaderExImpl.java:344)
~[ignite-core-2.7.6.jar:2.7.6]
at
org.apache.ignite.internal.binary.BinaryObjectImpl.field(BinaryObjectImpl.java:626)
~[ignite-core-2.7.6.jar:2.7.6]
at
org.apache.ignite.internal.binary.BinaryObjectExImpl.toString(BinaryObjectExImpl.java:225)
~[ignite-core-2.7.6.jar:2.7.6]
at
org.apache.ignite.internal.binary.BinaryObjectExImpl.appendValue(BinaryObjectExImpl.java:280)
~[ignite-core-2.7.6.jar:2.7.6]
at
org.apache.ignite.internal.binary.BinaryObjectExImpl.toString(BinaryObjectExImpl.java:229)
~[ignite-core-2.7.6.jar:2.7.6]
Caused by: org.apache.ignite.IgniteCheckedException: Failed to find class with
given class loader for unmarshalling (make sure same versions of all classes
are available on all nodes or enable peer-class-loading)
[clsLdr=sun.misc.Launcher$AppClassLoader@764c12b6,
cls=org.joda.time.chrono.ISOChronology$Stub]
at
org.apache.ignite.internal.marshaller.optimized.OptimizedMarshaller.unmarshal0(OptimizedMarshaller.java:233)
~[ignite-core-2.7.6.jar:2.7.6]
at
org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.unmarshal(AbstractNodeNameAwareMarshaller.java:94)
~[ignite-core-2.7.6.jar:2.7.6]
at
org.apache.ignite.internal.binary.BinaryUtils.doReadOptimized(BinaryUtils.java:1762)
~[ignite-core-2.7.6.jar:2.7.6]
at
org.apache.ignite.internal.binary.BinaryUtils.unmarshal(BinaryUtils.java:1971)
~[ignite-core-2.7.6.jar:2.7.6]
It looks like this library is missing or incorrectly configured in your setup:
Caused by: java.lang.ClassNotFoundException:
org.joda.time.chrono.ISOChronology$Stub
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
~[?:1.8.0_201]
at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[?:1.8.0_201]
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
~[?:1.8.0_201]
at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[?:1.8.0_201]
at java.lang.Class.forName0(Native Method) ~[?:1.8.0_201]
at java.lang.Class.forName(Class.java:348) ~[?:1.8.0_201]
at
org.apache.ignite.internal.util.IgniteUtils.forName(IgniteUtils.java:8775)
[ignite-core-2.7.6.jar:2.7.6]
at
org.apache.ignite.internal.MarshallerContextImpl.getClass(MarshallerContextImpl.java:349)
~[ignite-core-2.7.6.jar:2.7.6]
at
org.apache.ignite.internal.marshaller.optimized.OptimizedMarshallerUtils.classDescriptor(OptimizedMarshallerUtils.java:264)
~[ignite-core-2.7.6.jar:2.7.6]
[2019-10-25T02:10:34,352][ERROR][sys-stripe-50-#51][] JVM will be halted
immediately due to the failure: [failureCtx=FailureContext
[type=CRITICAL_ERROR, err=class o.a.i.IgniteException: Failed to create string
representation of binary object.]]
There is a Stub class nested in this class that implements Serializable, which
causes Ignite to use a different marshaller and leads to the error.
Joda class in question (look at the bottom of the file):
https://github.com/JodaOrg/joda-time/blob/master/src/main/java/org/joda/time/chrono/ISOChronology.java
Information about binary marshalling in Ignite:
https://apacheignite.readme.io/docs/binary-marshaller
The specific marshaller used in this case:
https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/marshaller/optimized/OptimizedMarshaller.java
Make sure all nodes have this library set up, and that IGNITE_HOME is properly
configured.
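One quick way to check whether the class is visible on a given node is a small stdlib-only probe like the sketch below. The class name ClassPathCheck is made up for illustration, and the result only means something if you run it with the same classpath that the Ignite node (or client) uses.

```java
public class ClassPathCheck {
    /** Returns true if the named class can be located by this class's loader. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, ClassPathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the ClassNotFoundException above.
        String cls = "org.joda.time.chrono.ISOChronology$Stub";
        System.out.println(cls + " loadable: " + isLoadable(cls));
    }
}
```

If this prints false on the node that halted, the joda-time jar is not on that node's classpath.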
Thanks, Alex
-----Original Message-----
From: ihalilaltun <[email protected]>
Sent: Friday, October 25, 2019 10:28 AM
To: [email protected]
Subject: Ignite node failure after network issue
Hi Igniters,
We had a network glitch last night and one node halted itself. Both client and
node logs are attached; can someone have a look and tell me the exact problem
here?
Archive.zip
<http://apache-ignite-users.70518.x6.nabble.com/file/t2515/Archive.zip>
We are on version 2.7.6. Our client application runs on spring-boot v2.0.6.
I have been searching all over the Apache Ignite online sources to find the
best practices for handling network problems on both the server and the
client side. If someone has such a source, I would be happy to read it.
Regards.
-----
İbrahim Halil Altun
Senior Software Engineer @ Segmentify