[ https://issues.apache.org/jira/browse/IGNITE-13504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wouter Bancken updated IGNITE-13504:
------------------------------------
    Description: 
*Details*

We're running Ignite 2.8 in-process and we are experiencing the following error:
{code:java}
2020-09-30 15:35:04.357 ERROR 6 --- [e-c4925392fef9%] o.a.i.spi.discovery.tcp.TcpDiscoverySpi : Failed to accept TCP connection.
java.net.SocketTimeoutException: Accept timed out
at java.base/java.net.PlainSocketImpl.socketAccept(Native Method)
at java.base/java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:458)
at java.base/java.net.ServerSocket.implAccept(ServerSocket.java:565)
at java.base/java.net.ServerSocket.accept(ServerSocket.java:533)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$TcpServer.body(ServerImpl.java:6353)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$TcpServerThread.body(ServerImpl.java:6276)
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:61){code}
Ignite considers this SocketTimeoutException in 
[ServerImpl|https://github.com/apache/ignite/blob/2.8.1/modules/core/src/main/java/org/apache/ignite/spi/discovery/tcp/ServerImpl.java#L6394]
 to be a critical error and as a result the StopNodeOrHaltFailureHandler shuts 
down the JVM:
{code:java}
2020-09-30 15:35:18.715 ERROR 6 --- [e-c4925392fef9%] : Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=java.net.SocketTimeoutException: Accept timed out]]{code}
Currently there seems to be no way to avoid this behaviour: ServerImpl creates 
a native socket without configuring a socket timeout, so the accept timeout 
depends entirely on the underlying OS.

The timeout is triggered while a separate action on the same system is 
executing a native method (loading fonts).

*Notes*

When we set a larger socket timeout of 10 seconds during debugging, the 
SocketTimeoutException no longer occurred. However, this timeout is not 
configurable in Ignite, so this is not an actual solution.
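
The exception itself can be reproduced in isolation with the JDK alone. The following is only a minimal sketch and not Ignite's code: the class AcceptTimeoutSketch, the method acceptWithRetries and the 100 ms timeout are hypothetical names and values chosen for illustration, and the timeout is set explicitly here to force the exception. It shows that ServerSocket.accept() throws SocketTimeoutException when a socket timeout is in effect and no connection arrives, and how an accept loop could treat that exception as retryable instead of letting it terminate the worker.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

/** Hypothetical illustration only; none of these names exist in Ignite. */
final class AcceptTimeoutSketch {
    /**
     * Waits for a single connection and treats an accept timeout as retryable.
     * Returns the number of timeouts seen before a client connected, or
     * maxTimeouts if no client connected at all.
     */
    static int acceptWithRetries(ServerSocket srv, int maxTimeouts) throws IOException {
        int timeouts = 0;
        while (timeouts < maxTimeouts) {
            try (Socket ignored = srv.accept()) {
                return timeouts; // a client connected; the socket is closed on exit
            } catch (SocketTimeoutException e) {
                timeouts++; // no client within SO_TIMEOUT: retry instead of failing
            }
        }
        return timeouts;
    }

    public static void main(String[] args) throws IOException {
        // Port 0 lets the OS pick a free port; bind to loopback only.
        try (ServerSocket srv = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            srv.setSoTimeout(100); // assumed value; accept() now gives up after 100 ms
            int timeouts = acceptWithRetries(srv, 3);
            System.out.println("accept() timed out " + timeouts + " time(s); the loop kept running");
        }
    }
}
```

Whether retrying is appropriate inside ServerImpl is a design question for the Ignite maintainers; the sketch only shows that the exception does not have to end the accept loop.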

*Example*

The following code base demonstrates the issue: 

[https://github.com/WouterBancken/ignite-crash-demo/blob/master/service/src/main/java/demo/testdocker/LocalIgniteServerConfiguration.java]
 

  was:
*Details*

We're running Ignite 2.8 in-process and we are experiencing the following error:
{code:java}
2020-09-30 15:35:04.357 ERROR 6 --- [e-c4925392fef9%] o.a.i.spi.discovery.tcp.TcpDiscoverySpi : Failed to accept TCP connection.
java.net.SocketTimeoutException: Accept timed out
at java.base/java.net.PlainSocketImpl.socketAccept(Native Method)
at java.base/java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:458)
at java.base/java.net.ServerSocket.implAccept(ServerSocket.java:565)
at java.base/java.net.ServerSocket.accept(ServerSocket.java:533)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$TcpServer.body(ServerImpl.java:6353)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at org.apache.ignite.spi.discovery.tcp.ServerImpl$TcpServerThread.body(ServerImpl.java:6276)
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:61){code}
Ignite considers this SocketTimeoutException in 
[ServerImpl|https://github.com/apache/ignite/blob/2.8.1/modules/core/src/main/java/org/apache/ignite/spi/discovery/tcp/ServerImpl.java#L6394]
 to be a critical error and as a result the StopNodeOrHaltFailureHandler shuts 
down the JVM:
{code:java}
2020-09-30 15:35:18.715 ERROR 6 --- [e-c4925392fef9%] : Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=java.net.SocketTimeoutException: Accept timed out]]{code}
Currently there seems to be no way to avoid this behaviour since ServerImpl 
creates a native socket without configuring a socket timeout so it is fully 
dependent on the underlying OS.

The timeout is triggered when we are executing a separate action on the system 
that is executing a native method (loading fonts).

*Notes*
 * When setting a larger socket timeout of 10 seconds during debugging, the 
SocketTimeoutException no longer occurred. This is not configurable in Ignite.

*Example*

The following code base demonstrates the issue: 

[https://github.com/WouterBancken/ignite-crash-demo/blob/master/service/src/main/java/demo/testdocker/LocalIgniteServerConfiguration.java]
 


> ServerImpl shuts down JVM after short timeout
> ---------------------------------------------
>
>                 Key: IGNITE-13504
>                 URL: https://issues.apache.org/jira/browse/IGNITE-13504
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.8.1
>            Reporter: Wouter Bancken
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
