Ilya Kasnacheev created IGNITE-14711:
----------------------------------------

             Summary: Client discovery thread interrupt/stop causes endless 
communication reconnect attempt
                 Key: IGNITE-14711
                 URL: https://issues.apache.org/jira/browse/IGNITE-14711
             Project: Ignite
          Issue Type: Bug
          Components: networking
    Affects Versions: 2.10
            Reporter: Ilya Kasnacheev


Original issue: if the tcp-client-disco-sock-reader thread dies on a client 
node, the node never disconnects from the cluster despite NODE_FAILED. Instead 
it endlessly tries to open communication connections to the server, getting 
"Remote node does not observe current node in topology" exceptions on the 
client and "Close incoming connection, unknown node" on the server.
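
One way to simulate the thread death described above from inside the client 
JVM is sketched below. This is not the attached reproducer, only an 
illustration using plain JDK calls. The class and method names 
(DiscoThreadInterrupter, findThreads, interruptThreads) are hypothetical, and 
matching the thread by a fragment of its name is an assumption about how the 
thread is labelled at runtime.

```java
import java.util.ArrayList;
import java.util.List;

public class DiscoThreadInterrupter {
    /** Returns live threads whose name contains the given fragment. */
    static List<Thread> findThreads(String nameFragment) {
        List<Thread> found = new ArrayList<>();
        // getAllStackTraces() gives a snapshot of all live threads in this JVM.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && t.getName().contains(nameFragment))
                found.add(t);
        }
        return found;
    }

    /** Interrupts every matching thread and returns how many were found. */
    static int interruptThreads(String nameFragment) {
        List<Thread> threads = findThreads(nameFragment);
        for (Thread t : threads)
            t.interrupt();
        return threads.size();
    }

    public static void main(String[] args) {
        // Assumption: this runs in a JVM that has already started a client node.
        int n = interruptThreads("tcp-client-disco-sock-reader");
        System.out.println("Interrupted threads: " + n);
    }
}
```

Run standalone, with no client node started in the JVM, no thread is expected 
to match, so the call is harmless.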

Generalized issue: stop()ing or interrupt()ing discovery threads causes the 
cluster to hang in many cases, where it is expected that any such node will do 
one of the following:
* Restart the thread and continue normally
* Disconnect from the cluster to re-establish discovery connection
* Stop and close all remaining threads.
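
The expected behaviour listed above could be driven by a small watchdog that 
notices when a worker thread ends unexpectedly and then runs one recovery 
action (restart, disconnect, or stop). The sketch below uses only JDK classes 
and does not reflect how Ignite is actually implemented. ThreadWatchdog, 
markStopping and watch are hypothetical names, and the recovery action is 
supplied by the caller.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ThreadWatchdog {
    /** Set to true when the node is shutting down on purpose. */
    private final AtomicBoolean stopping = new AtomicBoolean(false);

    /** Marks the shutdown as intentional so the watchdog stays quiet. */
    void markStopping() {
        stopping.set(true);
    }

    /**
     * Starts a daemon thread that waits for the worker to terminate and
     * runs the recovery action if the termination was not intentional.
     */
    Thread watch(Thread worker, Runnable recovery) {
        Thread watchdog = new Thread(() -> {
            try {
                worker.join(); // Blocks until the worker thread has ended.
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return; // The watchdog itself was interrupted: give up.
            }
            if (!stopping.get())
                recovery.run(); // E.g. restart, disconnect, or stop everything.
        }, "watchdog-" + worker.getName());
        watchdog.setDaemon(true);
        watchdog.start();
        return watchdog;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(60_000); // Stand-in for a blocking socket read.
            } catch (InterruptedException e) {
                // Worker dies on interrupt, as in the reported scenario.
            }
        }, "demo-worker");
        worker.start();

        ThreadWatchdog wd = new ThreadWatchdog();
        Thread watchdog = wd.watch(worker,
            () -> System.out.println("Worker died unexpectedly, recovering"));

        worker.interrupt();
        watchdog.join(5000);
    }
}
```

Which of the three actions the recovery callback should take is a design 
decision this sketch leaves open.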

See the attached reproducer.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
