Carl Sandland created CASSSIDECAR-354:
-----------------------------------------
Summary: cassinstancesdown/up metrics not updating when instances
go down
Key: CASSSIDECAR-354
URL: https://issues.apache.org/jira/browse/CASSSIDECAR-354
Project: Sidecar for Apache Cassandra
Issue Type: Bug
Components: Observability
Reporter: Carl Sandland
When stopping a Cassandra 'instance', Sidecar is not updating these metrics
correctly: the onFailure() block that does the updates is never called,
because exceptions are being swallowed in the delegate. Swallowed exceptions
don't seem to work well with promise chains.
My expectations were:
Assume a simple sidecar config with one attached cassandra instance, all
started up and running happily: cassinstancesdown = 0, cassinstancesup = 1.
Then manually stop Cassandra: cassinstancesdown = 1, cassinstancesup = 0.
Instead, I was seeing a constant cassinstancesdown = 0, cassinstancesup = 1.
Specifically, the code here:
{code:java}
private Future<Void> healthCheck(InstanceMetadata instanceMetadata,
                                 AtomicInteger instanceDown)
{
    return internalPool
           .runBlocking(() -> instanceMetadata.delegate().healthCheck(), false)
           .onFailure(cause -> {
               instanceDown.incrementAndGet();
               LOGGER.error("Unable to complete health check on instance={}",
                            instanceMetadata.id(), cause);
           });
}
{code}
The metric is updated in the onFailure() handler, yet the exceptions that would
trigger a failure (like not being able to connect) are swallowed inside the
delegate's (CassandraAdapterDelegate) healthCheck() call, so the future always
completes successfully.
I experimented by re-throwing the exceptions in the delegate and the metric
started tracking correctly. There is quite a lot of state change in the
delegate's exception handlers, so I didn't feel comfortable 'throwing' a
simplistic PR out.
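
To illustrate the behaviour described above, here is a minimal, self-contained
sketch. It is not the Sidecar code: it uses the JDK's CompletableFuture as a
stand-in for the Vert.x promise chain, and the class SwallowedFailureDemo, its
healthCheck(boolean) method and the swallow flag are all hypothetical names
invented for this example. It only shows that a failure callback cannot fire
when the task catches and drops its own exception.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; CompletableFuture stands in for the Vert.x Future chain.
class SwallowedFailureDemo
{
    // Stand-in for a delegate health check that cannot reach Cassandra.
    // When 'swallow' is true the exception is caught and dropped;
    // otherwise it is re-thrown after the (omitted) state handling.
    static void healthCheck(boolean swallow)
    {
        try
        {
            throw new IllegalStateException("unable to connect");
        }
        catch (RuntimeException e)
        {
            // a real delegate would update its internal state here
            if (!swallow)
            {
                throw e;
            }
        }
    }

    // Runs one health check and returns how many instances were counted as down.
    static int instancesDown(boolean swallow)
    {
        AtomicInteger instanceDown = new AtomicInteger();
        CompletableFuture
            .runAsync(() -> healthCheck(swallow))
            // plays the role of onFailure(): only sees a cause if the task threw
            .whenComplete((ignored, cause) -> {
                if (cause != null)
                {
                    instanceDown.incrementAndGet();
                }
            })
            // absorb the failure so join() below does not throw
            .handle((ignored, cause) -> null)
            .join();
        return instanceDown.get();
    }

    public static void main(String[] args)
    {
        System.out.println("swallowed: down=" + instancesDown(true));  // down=0
        System.out.println("rethrown:  down=" + instancesDown(false)); // down=1
    }
}
```

With the exception swallowed the counter stays at 0, which matches the constant
cassinstancesdown = 0 I was seeing. Re-throwing lets the failure reach the
callback and the counter moves to 1.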