Alexandr Kuramshin created IGNITE-7134:
------------------------------------------
Summary: Never-ending timeout in
IgniteSpiOperationTimeoutHelper.nextTimeoutChunk()
Key: IGNITE-7134
URL: https://issues.apache.org/jira/browse/IGNITE-7134
Project: Ignite
Issue Type: Bug
Components: general
Affects Versions: 2.3
Reporter: Alexandr Kuramshin
Priority: Critical
Fix For: 2.4
{noformat}
org.apache.ignite.spi.IgniteSpiOperationTimeoutHelper#nextTimeoutChunk
long curTs = U.currentTimeMillis();
timeout = timeout - (curTs - lastOperStartTs);
{noformat}
The timeout is not decreased at all if the delay between successive calls to
nextTimeoutChunk() is smaller than the discretization (granularity) of
U.currentTimeMillis(). This is easy to hit when an operation fails right after
nextTimeoutChunk() is invoked and is retried immediately.
Only rare pairs of calls that straddle an update of U.currentTimeMillis() (the
first right before the update and the second right after it) decrease the
timeout, so the actual IgniteSpiOperationTimeoutHelper timeout can be much
larger than failureDetectionTimeout.
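
The single-step effect described above can be illustrated with a small
simulation. This is only a sketch, not Ignite code: the class name, coarse(),
nextChunk() and the TICK_MS value of 10 are assumptions chosen for
illustration, and only the subtraction is taken from the quoted snippet.

```java
final class CoarseClockDemo {
    /** Hypothetical clock granularity in milliseconds (an assumption for this sketch). */
    static final long TICK_MS = 10;

    /** Rounds a precise timestamp down to the last tick, mimicking a coarse-grained clock. */
    static long coarse(long preciseMs) {
        return (preciseMs / TICK_MS) * TICK_MS;
    }

    /** One step of the quoted subtraction: timeout - (curTs - lastOperStartTs). */
    static long nextChunk(long timeout, long lastOperStartTs, long curTs) {
        return timeout - (curTs - lastOperStartTs);
    }

    public static void main(String[] args) {
        long timeout = 100;

        // Two calls 5 ms apart inside the same tick: both reads return 1000,
        // so the elapsed time is invisible and nothing is subtracted.
        long sameTick = nextChunk(timeout, coarse(1002), coarse(1007));

        // Two calls 5 ms apart that straddle a tick boundary: reads return
        // 1000 and 1010, so a whole tick is subtracted.
        long acrossTick = nextChunk(timeout, coarse(1007), coarse(1012));

        System.out.println(sameTick);   // prints 100
        System.out.println(acrossTick); // prints 90
    }
}
```

The sketch models one subtraction step only. How lastOperStartTs is updated
between calls is not shown in the snippet quoted in this issue, so that part
is deliberately left out.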
I suggest not splitting failureDetectionTimeout between network operations.
Instead, initialize the first operation timestamp at the first call to
nextTimeoutChunk(), and then calculate the timeout from the difference between
the current timestamp and the first operation timestamp.
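
A minimal sketch of this proposal, assuming a hypothetical helper class rather
than the actual IgniteSpiOperationTimeoutHelper: the class name, the injected
LongSupplier clock (standing in for U.currentTimeMillis()) and the use of
IllegalStateException are all illustrative choices, not part of Ignite.

```java
import java.util.function.LongSupplier;

final class DeadlineTimeoutSketch {
    /** Total budget in milliseconds, e.g. the configured failureDetectionTimeout. */
    private final long totalTimeout;

    /** Clock source; injected so that the behaviour can be checked deterministically. */
    private final LongSupplier clock;

    /** Timestamp captured on the first call to nextTimeoutChunk(). */
    private long firstOperStartTs;

    /** Whether firstOperStartTs has been captured yet. */
    private boolean started;

    DeadlineTimeoutSketch(long totalTimeout, LongSupplier clock) {
        this.totalTimeout = totalTimeout;
        this.clock = clock;
    }

    /**
     * Returns the remaining timeout, measured from the first call.
     * Throws once the whole budget has been used up.
     */
    long nextTimeoutChunk() {
        long curTs = clock.getAsLong();

        if (!started) {
            firstOperStartTs = curTs;
            started = true;
        }

        // Always computed against the first timestamp, so elapsed time is
        // never lost between calls, however often the method is invoked.
        long remaining = totalTimeout - (curTs - firstOperStartTs);

        if (remaining <= 0)
            throw new IllegalStateException("Operation timed out after " + totalTimeout + " ms");

        return remaining;
    }
}
```

With a clock reporting 1000, 1040 and then 1100 and a budget of 100 ms, the
calls return 100, then 60, and the third call throws. The result is still only
as precise as the clock itself, but the error stays bounded by a single clock
step.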