[ https://issues.apache.org/jira/browse/CASSANDRA-21025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sam Tunnicliffe updated CASSANDRA-21025:
----------------------------------------
Reviewers: Maxim Muzafarov, Stefan Miklosovic, Sam Tunnicliffe (was: Maxim
Muzafarov, Sam Tunnicliffe, Stefan Miklosovic)
Status: Review In Progress (was: Patch Available)
> Failure detector max interval value is calculated incorrectly
> -------------------------------------------------------------
>
> Key: CASSANDRA-21025
> URL: https://issues.apache.org/jira/browse/CASSANDRA-21025
> Project: Apache Cassandra
> Issue Type: Bug
> Components: Cluster/Gossip
> Reporter: Sam Tunnicliffe
> Assignee: Sam Tunnicliffe
> Priority: Normal
> Fix For: 5.0.x
>
>
> If this setting is not overridden via the {{cassandra.fd_max_interval_ms}}
> system property ({{CassandraRelevantProperties.FD_MAX_INTERVAL_MS}}),
> then it is seeded with the value of
> {{FailureDetector.INITIAL_VALUE_NANOS}}.
> However, a bug in the logic of
> {{FailureDetector$ArrivalWindow::getMaxInterval}} means in this case there is
> an incorrect conversion between time units.
> {code:java}
> public static long getMaxInterval()
> {
>     long newValue = FD_MAX_INTERVAL_MS.getLong(FailureDetector.INITIAL_VALUE_NANOS);
>     if (newValue != FailureDetector.INITIAL_VALUE_NANOS)
>         logger.info("Overriding {} from {}ms to {}ms",
>                     FD_MAX_INTERVAL_MS.getKey(), FailureDetector.INITIAL_VALUE_NANOS, newValue);
>     return TimeUnit.NANOSECONDS.convert(newValue, TimeUnit.MILLISECONDS);
> }
> {code}
> If {{FD_MAX_INTERVAL_MS}} is not set, the supplied default
> {{INITIAL_VALUE_NANOS}} is used, but this is then converted as if it were a
> value in millis, inflating it 1000000x.
> The effective max interval in this case should be 2 seconds, but instead
> becomes 23 days, 3 hours, 33 minutes & 20 seconds.
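The arithmetic behind those two figures can be checked with a short sketch. The class and method names below are illustrative only, and the 2-second value for {{INITIAL_VALUE_NANOS}} is taken from the description above rather than from the Cassandra source.

```java
import java.time.Duration;
import java.util.concurrent.TimeUnit;

final class MaxIntervalInflation
{
    // Assumption: the default is 2 seconds expressed in nanoseconds,
    // as implied by the issue description.
    static final long INITIAL_VALUE_NANOS = TimeUnit.SECONDS.toNanos(2);

    // Mirrors the faulty step: a nanosecond value is converted as if it were milliseconds.
    static long inflatedNanos()
    {
        return TimeUnit.NANOSECONDS.convert(INITIAL_VALUE_NANOS, TimeUnit.MILLISECONDS);
    }

    // Breaks a duration into days, hours, minutes and seconds (requires Java 9+).
    static String describe(Duration d)
    {
        return String.format("%d days, %d hours, %d minutes, %d seconds",
                             d.toDays(), d.toHoursPart(), d.toMinutesPart(), d.toSecondsPart());
    }

    public static void main(String[] args)
    {
        // prints "23 days, 3 hours, 33 minutes, 20 seconds"
        System.out.println(describe(Duration.ofNanos(inflatedNanos())));
    }
}
```

2 seconds is 2,000,000,000 ns; read as milliseconds, that is 2,000,000 seconds, which is where the 23-day figure comes from.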
> The net effect is that intervals way longer than expected can be recorded if
> nodes are intermittently partitioned but not restarted (meaning they retain
> the same gossip generation).
> In turn this can cause the phi calculation to react to those nodes much more
> slowly as the mean arrival time interval is much bigger than expected,
> leaving them marked as {{UP}} when they should be {{DOWN}}.
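To illustrate why one oversized interval matters, here is a simplified model of a phi calculation. It is a sketch, not Cassandra's {{FailureDetector}}: it assumes phi is the time since the last heartbeat divided by the mean recorded interval, scaled by 1/ln(10), and it assumes a 1-second heartbeat and a 10-sample window. All names are hypothetical.

```java
import java.util.Arrays;

final class PhiSketch
{
    // Scaling factor for an exponential inter-arrival model: 1 / ln(10).
    static final double PHI_FACTOR = 1.0 / Math.log(10.0);
    static final long SECOND = 1_000_000_000L;

    // Simplified phi: silence since the last heartbeat relative to the mean interval.
    static double phi(long[] intervalsNanos, long sinceLastNanos)
    {
        double sum = 0;
        for (long interval : intervalsNanos)
            sum += interval;
        double mean = sum / intervalsNanos.length;
        return PHI_FACTOR * sinceLastNanos / mean;
    }

    // A window of 1-second intervals whose final sample is replaced by lastIntervalNanos.
    static long[] window(int samples, long lastIntervalNanos)
    {
        long[] w = new long[samples];
        Arrays.fill(w, SECOND);
        w[samples - 1] = lastIntervalNanos;
        return w;
    }

    public static void main(String[] args)
    {
        long silence = 20 * SECOND;
        // Healthy window: every recorded interval is 1 second.
        System.out.println("healthy:  " + phi(window(10, SECOND), silence));
        // Polluted window: one 1-hour interval slipped past the max-interval check.
        System.out.println("polluted: " + phi(window(10, 3600 * SECOND), silence));
    }
}
```

Under these assumptions, 20 seconds of silence pushes phi above 8 for the healthy window, while the window containing a single 1-hour interval stays well below 1, so the node would not be convicted for a long time.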
> If {{FD_MAX_INTERVAL_MS}} is overridden then the conversion, and so the
> returned value, is correct (assuming an appropriately scaled value is
> supplied; there is no guardrail to ensure that). Versions earlier than 5.0
> are not affected.
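One possible shape for a fix is to convert only the override and return the default untouched, since it is already in nanoseconds. The following is a self-contained sketch and not the patch attached to this ticket: it reads the property with {{Long.getLong}} instead of {{CassandraRelevantProperties}}, assumes a 2-second default, and adds a purely hypothetical upper bound ({{MAX_OVERRIDE_MS}}) as an example guardrail.

```java
import java.util.concurrent.TimeUnit;

final class MaxIntervalSketch
{
    static final String PROPERTY = "cassandra.fd_max_interval_ms";

    // Assumption: 2-second default, held in nanoseconds.
    static final long INITIAL_VALUE_NANOS = TimeUnit.SECONDS.toNanos(2);

    // Hypothetical guardrail: reject overrides above one hour.
    static final long MAX_OVERRIDE_MS = TimeUnit.HOURS.toMillis(1);

    static long getMaxIntervalNanos()
    {
        // Long.getLong returns null when the property is unset or not a number.
        Long overrideMs = Long.getLong(PROPERTY);
        if (overrideMs == null)
            return INITIAL_VALUE_NANOS; // already nanoseconds, so no conversion

        if (overrideMs <= 0 || overrideMs > MAX_OVERRIDE_MS)
            throw new IllegalArgumentException(PROPERTY + " out of range: " + overrideMs + "ms");

        // Only the millisecond override is converted.
        return TimeUnit.MILLISECONDS.toNanos(overrideMs);
    }

    public static void main(String[] args)
    {
        System.out.println(getMaxIntervalNanos());
    }
}
```

Keeping the default and the override in separate branches means each value is only ever handled in its own unit, which avoids the mixed-unit comparison in the original method.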
--
This message was sent by Atlassian Jira
(v8.20.10#820010)