Mikhail Petrov created IGNITE-19715:
---------------------------------------

             Summary: Thin client operations can take a long time if PA is 
enabled and some cluster nodes are not network reachable.
                 Key: IGNITE-19715
                 URL: https://issues.apache.org/jira/browse/IGNITE-19715
             Project: Ignite
          Issue Type: Bug
            Reporter: Mikhail Petrov


Thin client operations can take a long time if partition awareness (PA) is 
enabled and some cluster nodes are not reachable over the network.

Consider the following scenario:

1. The thin client has already successfully established connections to all 
configured node addresses.
2. A particular cluster node becomes unreachable over the network. This can be 
reproduced with iptables -A INPUT -p tcp --dport for Linux.
3. The thin client periodically sends a put request that PA maps to the 
unreachable node.
4. At first, every attempt to perform the put fails with `ClientException: 
Timeout was reached before computation completed.` Eventually, however, the OS 
closes the connection to the unreachable node (see tcp_keepalive_time for 
Linux).
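
A minimal sketch of a client that could be used to observe the scenario above. 
It assumes the Ignite thin client library is on the classpath and a cluster is 
running. The addresses, the cache name, the 5 second timeout and the class name 
are hypothetical values chosen for illustration only.

```java
import org.apache.ignite.Ignition;
import org.apache.ignite.client.ClientCache;
import org.apache.ignite.client.ClientException;
import org.apache.ignite.client.IgniteClient;
import org.apache.ignite.configuration.ClientConfiguration;

public class PaTimeoutScenario {
    public static void main(String[] args) throws Exception {
        ClientConfiguration cfg = new ClientConfiguration()
            // Hypothetical addresses: one entry per cluster node.
            .setAddresses("10.0.0.1:10800", "10.0.0.2:10800", "10.0.0.3:10800")
            // Route each request to the node that PA maps the key to.
            .setPartitionAwarenessEnabled(true)
            // The timeout (in milliseconds) that the put is expected to honor.
            .setTimeout(5_000);

        try (IgniteClient client = Ignition.startClient(cfg)) {
            ClientCache<Integer, String> cache = client.getOrCreateCache("test-cache");

            // Different keys map to different nodes, so some puts hit the blocked node.
            for (int key = 0; key < 1_000; key++) {
                long start = System.nanoTime();

                try {
                    cache.put(key, "value");
                }
                catch (ClientException e) {
                    System.out.println("put(" + key + ") failed: " + e.getMessage());
                }

                long elapsedMs = (System.nanoTime() - start) / 1_000_000;

                // Durations well above the configured timeout indicate the problem.
                System.out.println("put(" + key + ") took " + elapsedMs + " ms");

                Thread.sleep(1_000);
            }
        }
    }
}
```

Block one node's client port while the loop is running and compare the printed 
durations with the configured timeout.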

Once the OS has closed the connection, the next put tries to re-establish the 
connection to the unreachable node (see ReliableChannel.java:1012).

We currently do not set a timeout for the open connection operation (see 
GridNioClientConnectionMultiplexer#open, here we use Integer.MAX_VALUE for )

As a result, the put operation hangs for a significant amount of time and 
ignores the ClientConfiguration#setTimeout property.
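
To illustrate what bounding the open operation means, here is a minimal sketch 
in plain Java. It is not Ignite code and not a proposed patch: BoundedOpen and 
openWithTimeout are hypothetical names, and the blocking open call is modelled 
as an arbitrary Callable.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper (not part of Ignite) that bounds a blocking open call. */
class BoundedOpen {
    /**
     * Runs {@code open} on a worker thread and waits at most {@code timeoutMs}.
     * Throws TimeoutException if the call has not completed by then.
     */
    static <T> T openWithTimeout(Callable<T> open, long timeoutMs) throws Exception {
        ExecutorService exec = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-open");
            t.setDaemon(true); // A stuck attempt must not keep the JVM alive.
            return t;
        });

        Future<T> fut = exec.submit(open);

        try {
            // The caller is blocked for at most timeoutMs, not indefinitely.
            return fut.get(timeoutMs, TimeUnit.MILLISECONDS);
        }
        catch (TimeoutException e) {
            fut.cancel(true); // Interrupt the attempt that is still running.
            throw e;
        }
        catch (ExecutionException e) {
            // Rethrow the original failure of the open call where possible.
            Throwable cause = e.getCause();

            if (cause instanceof Exception)
                throw (Exception)cause;

            throw e;
        }
        finally {
            exec.shutdownNow();
        }
    }
}
```

With such a bound, an attempt to open a connection to an unreachable node 
fails after the configured time instead of waiting for the OS to give up. 
Cancelling by interruption only helps if the underlying open call reacts to 
interrupts.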



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
