Hi, 

Your discovery section is non-standard: the same address entry is repeated many
times, and there is no need to repeat it. It is also best to prepend the
local/loopback IP to be on the safe side, e.g. 127.0.0.1:47500..47509.


    <property name="addresses">
                            <list>
                                <value>:47500..47509</value>
                                <value>:47500..47509</value>
                                <value>:47500..47509</value>
                                <value>:47500..47509</value>
                                <value>:47500..47509</value>
                                <value>:47500..47509</value>
                                <value>:47500..47509</value>
                                <value>:47500..47509</value>
                                <value>:47500..47509</value>
                                <value>:47500..47509</value>
                                <value>:47500..47509</value>
                                <value>:47500..47509</value>
                                <value>:47500..47509</value>
                                <value>:47500..47509</value>
                                <value>:47500..47509</value>
                                <value>:47500..47509</value>
                                <value>:47500..47509</value>
                                <value>:47500..47509</value>
                                <value>:47500..47509</value>
                                <value>:47500..47509</value>
                            </list>
                        </property>

The client-side configuration has the same discovery section.



Also, the thread pool sizes in your config are on the large side:

        <property name="systemThreadPoolSize" value="128"/>
        <property name="publicThreadPoolSize" value="128"/>
        <property name="queryThreadPoolSize" value="128"/>
        <property name="serviceThreadPoolSize" value="128"/>
        <property name="stripedPoolSize" value="128"/>
        <property name="dataStreamerThreadPoolSize" value="64"/>
        <property name="rebalanceThreadPoolSize" value="8"/>

This will create many unnecessary threads, so keep these values only if you
have a specific need for them. See:
https://apacheignite.readme.io/docs/performance-tips#section-configure-thread-pools
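
As a rough sanity check, the configured sizes can be compared with the number
of cores on the host. The sketch below assumes that Ignite's default for most
of these pools is max(8, available cores); that is my recollection of the 2.x
defaults, not something taken from your logs, so please verify it against the
page above. The class and method names are made up for illustration.

```java
public class PoolSizeCheck {
    // Assumed default sizing rule (verify against the Ignite docs):
    // at least 8 threads, otherwise one thread per available core.
    static int assumedDefaultPoolSize(int cores) {
        return Math.max(8, cores);
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        int configured = 128; // value used for most pools in the config above

        System.out.println("Available cores:           " + cores);
        System.out.println("Assumed default pool size: " + assumedDefaultPoolSize(cores));
        System.out.println("Configured pool size:      " + configured);
    }
}
```

If the configured value is many times larger than the assumed default on your
hosts, removing the properties and letting Ignite pick its defaults is probably
the simpler starting point.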


The root cause of the stoppage looks to be a missing or incorrectly configured
library, which shows up as a serialization issue with Joda-Time objects:

Caused by: org.apache.ignite.IgniteException: Failed to create string representation of binary object.
        at org.apache.ignite.internal.binary.BinaryObjectExImpl.toString(BinaryObjectExImpl.java:189) ~[ignite-core-2.7.6.jar:2.7.6]
        at org.apache.ignite.internal.binary.BinaryObjectImpl.toString(BinaryObjectImpl.java:920) ~[ignite-core-2.7.6.jar:2.7.6]
        at java.lang.String.valueOf(String.java:2994) ~[?:1.8.0_201]
        at org.apache.ignite.internal.util.GridStringBuilder.a(GridStringBuilder.java:101) ~[ignite-core-2.7.6.jar:2.7.6]
        at org.apache.ignite.internal.util.tostring.SBLimitedLength.a(SBLimitedLength.java:88) ~[ignite-core-2.7.6.jar:2.7.6]
        at org.apache.ignite.internal.util.tostring.GridToStringBuilder.toString(GridToStringBuilder.java:939) ~[ignite-core-2.7.6.jar:2.7.6]
        at org.apache.ignite.internal.util.tostring.GridToStringBuilder.toStringImpl(GridToStringBuilder.java:1005) ~[ignite-core-2.7.6.jar:2.7.6]
        ... 26 more
Caused by: org.apache.ignite.binary.BinaryObjectException: Failed to read field: iChronology
        at org.apache.ignite.internal.binary.BinaryReaderExImpl.wrapFieldException(BinaryReaderExImpl.java:447) ~[ignite-core-2.7.6.jar:2.7.6]
        at org.apache.ignite.internal.binary.BinaryReaderExImpl.unmarshalField(BinaryReaderExImpl.java:344) ~[ignite-core-2.7.6.jar:2.7.6]
        at org.apache.ignite.internal.binary.BinaryObjectImpl.field(BinaryObjectImpl.java:626) ~[ignite-core-2.7.6.jar:2.7.6]
        at org.apache.ignite.internal.binary.BinaryObjectExImpl.toString(BinaryObjectExImpl.java:225) ~[ignite-core-2.7.6.jar:2.7.6]
        at org.apache.ignite.internal.binary.BinaryObjectExImpl.appendValue(BinaryObjectExImpl.java:280) ~[ignite-core-2.7.6.jar:2.7.6]
        at org.apache.ignite.internal.binary.BinaryObjectExImpl.toString(BinaryObjectExImpl.java:229) ~[ignite-core-2.7.6.jar:2.7.6]

Caused by: org.apache.ignite.IgniteCheckedException: Failed to find class with given class loader for unmarshalling (make sure same versions of all classes are available on all nodes or enable peer-class-loading) [clsLdr=sun.misc.Launcher$AppClassLoader@764c12b6, cls=org.joda.time.chrono.ISOChronology$Stub]
        at org.apache.ignite.internal.marshaller.optimized.OptimizedMarshaller.unmarshal0(OptimizedMarshaller.java:233) ~[ignite-core-2.7.6.jar:2.7.6]
        at org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.unmarshal(AbstractNodeNameAwareMarshaller.java:94) ~[ignite-core-2.7.6.jar:2.7.6]
        at org.apache.ignite.internal.binary.BinaryUtils.doReadOptimized(BinaryUtils.java:1762) ~[ignite-core-2.7.6.jar:2.7.6]
        at org.apache.ignite.internal.binary.BinaryUtils.unmarshal(BinaryUtils.java:1971) ~[ignite-core-2.7.6.jar:2.7.6]


It looks like this library is missing or incorrectly configured in your setup:
Caused by: java.lang.ClassNotFoundException: org.joda.time.chrono.ISOChronology$Stub
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382) ~[?:1.8.0_201]
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[?:1.8.0_201]
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) ~[?:1.8.0_201]
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[?:1.8.0_201]
        at java.lang.Class.forName0(Native Method) ~[?:1.8.0_201]
        at java.lang.Class.forName(Class.java:348) ~[?:1.8.0_201]
        at org.apache.ignite.internal.util.IgniteUtils.forName(IgniteUtils.java:8775) [ignite-core-2.7.6.jar:2.7.6]
        at org.apache.ignite.internal.MarshallerContextImpl.getClass(MarshallerContextImpl.java:349) ~[ignite-core-2.7.6.jar:2.7.6]
        at org.apache.ignite.internal.marshaller.optimized.OptimizedMarshallerUtils.classDescriptor(OptimizedMarshallerUtils.java:264) ~[ignite-core-2.7.6.jar:2.7.6]


[2019-10-25T02:10:34,352][ERROR][sys-stripe-50-#51][] JVM will be halted immediately due to the failure: [failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.IgniteException: Failed to create string representation of binary object.]]

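ISOChronology contains a nested Stub class that implements Serializable. This
causes Ignite to use a different marshaller for it (the OptimizedMarshaller
seen in the trace above), and that is where the error occurs.
The Joda class in question (look at the bottom of the file):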
https://github.com/JodaOrg/joda-time/blob/master/src/main/java/org/joda/time/chrono/ISOChronology.java

Information about binary marshalling in Ignite:
https://apacheignite.readme.io/docs/binary-marshaller

The specific marshaller used in this case:
https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/marshaller/optimized/OptimizedMarshaller.java

Make sure this library is set up on all nodes, and that IGNITE_HOME is properly
configured.
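
One quick way to confirm this on each node is to try loading the class from the
trace with the same classpath the node is started with. The following is only a
minimal sketch using plain JDK calls; JodaClasspathCheck and isLoadable are
hypothetical names chosen for illustration and are not part of Ignite.

```java
public class JodaClasspathCheck {
    // Returns true if the named class is visible to this class's class loader.
    // The class is looked up without being initialized.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, JodaClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the ClassNotFoundException in the trace above.
        String cls = "org.joda.time.chrono.ISOChronology$Stub";
        System.out.println(cls + " loadable: " + isLoadable(cls));
    }
}
```

If this prints false when run with the classpath of the node that halted, the
joda-time jar is most likely not visible to that node.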

Thanks, Alex

-----Original Message-----
From: ihalilaltun <[email protected]> 
Sent: Friday, October 25, 2019 10:28 AM
To: [email protected]
Subject: Ignite node failure after network issue

Hi Igniters,

We had a network glitch last night and one node halted itself. Both client and 
node logs are attached; can someone have a look and tell me the exact problem 
here?

Archive.zip
<http://apache-ignite-users.70518.x6.nabble.com/file/t2515/Archive.zip>  

We are on version 2.7.6. Our client application runs on spring-boot v2.0.6.

I have been searching all over the Apache Ignite online sources to find the 
best practices for handling network problems on both the server and client 
side. If someone has such a source, I would be happy to read it.

Regards.



-----
İbrahim Halil Altun
Senior Software Engineer @ Segmentify
--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
