[jira] [Created] (CASSSIDECAR-354) cassinstancesdown/up metrics not updating when instances go down

Carl Sandland (Jira) Sat, 18 Oct 2025 04:29:31 -0700

Carl Sandland created CASSSIDECAR-354:
-----------------------------------------


             Summary: cassinstancesdown/up metrics not updating when instances 
go down
                 Key: CASSSIDECAR-354
                 URL: https://issues.apache.org/jira/browse/CASSSIDECAR-354
             Project: Sidecar for Apache Cassandra
          Issue Type: Bug
          Components: Observability
            Reporter: Carl Sandland


When stopping a Cassandra 'instance', sidecar is not updating these metrics 
correctly: the onFailure() block that does the updates is never called, 
because exceptions are swallowed in the delegate. A swallowed exception lets 
the future complete successfully, so failure handlers further down the 
promise chain never run.

My expectations were:

Assume a simple sidecar config with one attached cassandra instance, all 
started up and running happily: cassinstancesdown = 0, cassinstancesup = 1. 
Then manually stop cassandra: cassinstancesdown = 1, cassinstancesup = 0

What I saw instead was a constant cassinstancesdown = 0, cassinstancesup = 1.

Specifically, the code here:
{code:java}
private Future<Void> healthCheck(InstanceMetadata instanceMetadata, AtomicInteger instanceDown)
{
    return internalPool
           .runBlocking(() -> instanceMetadata.delegate().healthCheck(), false)
           .onFailure(cause -> {
               instanceDown.incrementAndGet();
               LOGGER.error("Unable to complete health check on instance={}",
                            instanceMetadata.id(), cause);
           });
}
{code}
The metric is updated in onFailure(), yet the exceptions that would trigger 
a failure (such as being unable to connect) are swallowed by the delegate's 
(CassandraAdapterDelegate) healthCheck() call.
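
To illustrate the behaviour described above, here is a minimal sketch. It is
not the sidecar code: it uses java.util.concurrent.CompletableFuture as a
stand-in for the Vert.x Future chain, and SwallowingDelegate is a hypothetical
class that only mimics a health check catching its own exception. Because the
exception never leaves healthCheck(), the future completes normally and the
failure callback that would bump the counter is skipped.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the delegate; NOT the real CassandraAdapterDelegate.
class SwallowingDelegate
{
    void healthCheck()
    {
        try
        {
            // Simulates the instance being stopped.
            throw new IllegalStateException("cannot connect");
        }
        catch (RuntimeException e)
        {
            // The exception is handled here and never reaches the caller.
        }
    }
}

class SwallowedFailureDemo
{
    // Returns the "down" count after one health check against a stopped instance.
    static int downCountAfterCheck()
    {
        AtomicInteger instanceDown = new AtomicInteger();
        SwallowingDelegate delegate = new SwallowingDelegate();

        CompletableFuture
            .runAsync(delegate::healthCheck)
            // Plays the role of onFailure(): only counts when a cause is present.
            .whenComplete((ignored, cause) -> {
                if (cause != null)
                {
                    instanceDown.incrementAndGet();
                }
            })
            // Keeps join() from throwing if the future did fail.
            .handle((ignored, cause) -> null)
            .join();

        return instanceDown.get();
    }

    public static void main(String[] args)
    {
        // Prints "cassinstancesdown = 0" even though the instance is unreachable.
        System.out.println("cassinstancesdown = " + downCountAfterCheck());
    }
}
```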

I experimented by re-throwing the exceptions in the delegate and the metric 
started tracking correctly. There is quite a lot of state change in the 
delegate's exception handlers, so I didn't feel comfortable 'throwing' a 
simplistic PR out.
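
The re-throw experiment can be sketched roughly as follows. This is an
illustration only, not a proposed patch: CompletableFuture again stands in
for the Vert.x Future, and RethrowingDelegate and its nativeUp flag are
hypothetical names chosen to represent "some state the handler updates". The
catch block still performs its state change, then re-throws so the failure
reaches the callback that counts the instance as down.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the delegate; NOT the real CassandraAdapterDelegate.
class RethrowingDelegate
{
    // Hypothetical piece of delegate state updated by the exception handler.
    final AtomicBoolean nativeUp = new AtomicBoolean(true);

    void healthCheck()
    {
        try
        {
            // Simulates the instance being stopped.
            throw new IllegalStateException("cannot connect");
        }
        catch (RuntimeException e)
        {
            nativeUp.set(false); // keep the existing state update
            throw e;             // then let the failure propagate
        }
    }
}

class RethrowDemo
{
    // Returns the "down" count after one health check against a stopped instance.
    static int downCountAfterCheck(RethrowingDelegate delegate)
    {
        AtomicInteger instanceDown = new AtomicInteger();

        CompletableFuture
            .runAsync(delegate::healthCheck)
            // Plays the role of onFailure(): only counts when a cause is present.
            .whenComplete((ignored, cause) -> {
                if (cause != null)
                {
                    instanceDown.incrementAndGet();
                }
            })
            // Recover so join() does not throw in this demo.
            .exceptionally(cause -> null)
            .join();

        return instanceDown.get();
    }

    public static void main(String[] args)
    {
        RethrowingDelegate delegate = new RethrowingDelegate();
        System.out.println("cassinstancesdown = " + downCountAfterCheck(delegate));
        System.out.println("nativeUp = " + delegate.nativeUp.get());
    }
}
```

Whether re-throwing is safe in the real delegate depends on what its other
callers expect, which this sketch does not model.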



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

