Roman Puchkovskiy created IGNITE-17944:
------------------------------------------

             Summary: Logging storm on server node failure/disconnect during 
loading through a DataStreamers
                 Key: IGNITE-17944
                 URL: https://issues.apache.org/jira/browse/IGNITE-17944
             Project: Ignite
          Issue Type: Bug
            Reporter: Roman Puchkovskiy


This can be reproduced in RebalanceIteratorLargeEntriesOOMTest when changing 
#additionalRemoteJvmArgs() to return this:

return Arrays.asList("-Xmx128m", "-Xms128m", "-XX:+HeapDumpOnOutOfMemoryError", 
"-XX:+CrashOnOutOfMemoryError");

On my machine, when the remote node crashes, the machine becomes barely 
responsive because all cores spike to 100% load.

A LOT of logging happens at that point. Probably many Buffers of DataStreamers 
get cancelled, so many futures get cancelled; cancellation is done by 
completing a future with an exception, and each such completion gets logged.
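
The suspected mechanism can be illustrated with a minimal standalone sketch. 
It does not use any Ignite code: the class LoggingStormSketch, the method 
failAll() and the per-future listener are hypothetical, and a counter stands in 
for the logger. It only shows that failing N pending futures, each with its own 
failure-logging listener, yields N log records at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

class LoggingStormSketch {
    /**
     * Creates {@code count} pending futures, fails all of them, and returns
     * how many "log records" the listeners produced.
     */
    static int failAll(int count) {
        // Stands in for a real logger; each increment is one log.error(...) call.
        AtomicInteger logRecords = new AtomicInteger();

        List<CompletableFuture<Void>> pending = new ArrayList<>();

        for (int i = 0; i < count; i++) {
            CompletableFuture<Void> fut = new CompletableFuture<>();

            // Hypothetical listener that logs every failure individually.
            fut.whenComplete((res, err) -> {
                if (err != null)
                    logRecords.incrementAndGet();
            });

            pending.add(fut);
        }

        // Simulated node failure: every pending future is "cancelled"
        // by completing it with an exception.
        CancellationException cause = new CancellationException("Node left");

        for (CompletableFuture<Void> fut : pending)
            fut.completeExceptionally(cause);

        return logRecords.get();
    }

    public static void main(String[] args) {
        System.out.println("Log records: " + failAll(10_000));
    }
}
```

With a real logger, each record would typically also carry a stack trace, so 
the cost per failed future is higher than a counter increment suggests.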

It seems that the same thing might happen in production if a node suddenly 
crashes while a client is loading data.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
