Re: Server stuck while joining cluster

Alexey Kukushkin Sat, 03 Jul 2021 04:54:54 -0700

Check whether there are active transactions running on your Server3 at the time
when Servers 1 and 2 are trying to join. Servers wait for active
transactions to complete before joining the cluster, and if a transaction
takes longer than the join timeout (10 seconds by default), the joining node
is segmented out and will not join the cluster again until it is restarted.
You can use the control script to check active transactions:
https://ignite.apache.org/docs/latest/tools/control-script#transaction-management
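
As a rough sketch of how that check could be scripted, the Python helper below
only assembles a control script command line for listing long-running
transactions. The function name build_tx_command and the script location are
hypothetical; the flags (--tx, --min-duration, --clients, --order) are the ones
described on the page linked above, so please verify them against your Ignite
version before relying on this.

```python
import shlex

# Sort keys accepted by the --order flag, per the control script documentation.
VALID_ORDERS = ("DURATION", "SIZE", "START_TIME")


def build_tx_command(script="control.sh", min_duration=None,
                     clients_only=False, order=None):
    """Return an argv list that asks the control script to list transactions.

    script       -- path to control.sh (assumption: adjust to your install)
    min_duration -- only show transactions running at least this many seconds
    clients_only -- restrict the listing to client nodes
    order        -- optional sort key, one of VALID_ORDERS
    """
    argv = [script, "--tx"]
    if min_duration is not None:
        argv += ["--min-duration", str(int(min_duration))]
    if clients_only:
        argv.append("--clients")
    if order is not None:
        if order not in VALID_ORDERS:
            raise ValueError("order must be one of %s" % (VALID_ORDERS,))
        argv += ["--order", order]
    return argv


if __name__ == "__main__":
    # Transactions on client nodes that have been open for 10 seconds or more.
    cmd = build_tx_command(min_duration=10, clients_only=True, order="DURATION")
    print(shlex.join(cmd))
```

Running the printed command while the restarted server is stuck in joinTopology
should show whether a transaction started from the client on Server3 has been
open longer than the timeout.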



Thu, 1 Jul 2021 at 18:20, Joan Pujol <joanpu...@gmail.com>:

> Hi,
>
> I have three servers running Tomcat with Ignite 2.10 embedded. If I start
> all the nodes from a cold start (all servers stopped), it works well.
> But if I stop one of the server nodes and restart it, it gets stuck
> joining the cluster and retries indefinitely with
> WARN  [main] o.a.i.s.d.t.TcpDiscoverySpi - Timed out waiting for message
> delivery receipt and other timeout messages.
>
> Details:
> Server1 (10.114.0.8): Two WARs with Ignite server embedded
> Server2 (10.114.0.13): A WAR with Ignite server embedded
> Server3 (10.114.0.9): A WAR with Ignite client
>
> The problematic server is Server1, which gets stuck if it is started while
> Server3 is also running.
> If Server3 is stopped, then Server1 starts correctly without getting
> stuck.
>
> I attach the server log from Server1:
> https://www.dropbox.com/s/2xc4as3qqorq21q/server1.log?dl=0
> And from Server2: https://www.dropbox.com/s/vtxhhg690aorbvo/server2.log?dl=0
>
> And the stack trace of where Server1 gets stuck:
> wait:-1, Object (java.lang)
> joinTopology:1179, ServerImpl (org.apache.ignite.spi.discovery.tcp)
> spiStart:472, ServerImpl (org.apache.ignite.spi.discovery.tcp)
> spiStart:2154, TcpDiscoverySpi (org.apache.ignite.spi.discovery.tcp)
> startSpi:278, GridManagerAdapter (org.apache.ignite.internal.managers)
> start:981, GridDiscoveryManager (org.apache.ignite.internal.managers.discovery)
> startManager:1968, IgniteKernal (org.apache.ignite.internal)
> start:1324, IgniteKernal (org.apache.ignite.internal)
> start0:2112, IgnitionEx$IgniteNamedInstance (org.apache.ignite.internal)
> start:1758, IgnitionEx$IgniteNamedInstance (org.apache.ignite.internal)
> start0:1143, IgnitionEx (org.apache.ignite.internal)
> start:663, IgnitionEx (org.apache.ignite.internal)
> start:589, IgnitionEx (org.apache.ignite.internal)
> start:328, Ignition (org.apache.ignite)
>
>
> The configuration can be seen in the Server1 log, but basically the nodes
> are configured with multicast discovery, with the IPs of the two server
> nodes also given explicitly.
>
> Any help or guidance in finding the problem will be greatly appreciated.
>
> Cheers,
>
> --
> Joan Jesús Pujol Espinar
>


-- 
Best regards,
Alexey
