[jira] [Commented] (CASSANDRA-19651) idealCLWriteLatency metric reports the worst response time instead of the time when ideal CL is satisfied

Ariel Weisberg (Jira) Thu, 22 Aug 2024 12:41:46 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-19651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17876087#comment-17876087
 ]


Ariel Weisberg commented on CASSANDRA-19651:
--------------------------------------------

We have `spinAssertEquals` for this purpose. I'm not 100% sure the issue is 
that we aren't waiting, if `Instance.startup` is finished should that Gossip 
state not be present? It may be that we have schema and gossip state from some, 
but not all nodes once startup is done. I would just double check to make sure 
startup is working as intended in terms of waiting until it's received enough 
state from Gossip that it is actually done and there isn't some other root 
cause bug here.

See `Gossiper.waitToSettle` for context.

In 5.1 all those checks are removed and it just checks the underlying table to 
see if the data is there. I wonder why it needs to do the `pullSchemaFrom`.

> idealCLWriteLatency metric reports the worst response time instead of the 
> time when ideal CL is satisfied
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19651
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19651
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Observability
>            Reporter: Dmitry Konstantinov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>             Fix For: 4.0.14, 5.0.1, 5.1, 4.1.7
>
>         Attachments: 
> ci_summary-cassandra-4.0-a75f6c3e81f677e50c0a0d467dd5dad672f923e3.html, 
> ci_summary-cassandra-4.1-1ed312f881c0c170c8833ff9fbf397ab8fc625cc.html, 
> ci_summary-cassandra-5.0-009f2982ac88d9c9bc0a7a7d29220f055aa7f11e.html, 
> ci_summary-trunk-da68729322515b4a7a698b73a0154ecdeb3abf39.html, 
> result_details-cassandra-4.0-a75f6c3e81f677e50c0a0d467dd5dad672f923e3.tar.gz, 
> result_details-cassandra-5.0-009f2982ac88d9c9bc0a7a7d29220f055aa7f11e.tar.gz, 
> result_details-trunk-da68729322515b4a7a698b73a0154ecdeb3abf39.tar.gz, 
> select-junit-tests-rerun-4.1.zip
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> org.apache.cassandra.service.AbstractWriteResponseHandler:
> {code:java}
> private final void decrementResponseOrExpired()
> {
>     int decrementedValue = responsesAndExpirations.decrementAndGet();
>     if (decrementedValue == 0)
>     {
>         // The condition being signaled is a valid proxy for the CL being 
> achieved
>         // Only mark it as failed if the requested CL was achieved.
>         if (!condition.isSignalled() && requestedCLAchieved)
>         {
>             replicaPlan.keyspace().metric.writeFailedIdealCL.inc();
>         }
>         else
>         {
>             
> replicaPlan.keyspace().metric.idealCLWriteLatency.addNano(nanoTime() - 
> queryStartNanoTime);
>         }
>     }
> } {code}
> Actual result: responsesAndExpirations is a total number of replicas across 
> all DCs which does not depend on the ideal CL, so the metric value for 
> replicaPlan.keyspace().metric.idealCLWriteLatency is updated when we get the 
> latest response/timeout for all replicas.
> Expected result: replicaPlan.keyspace().metric.idealCLWriteLatency is updated 
> when we get enough responses from replicas according to the ideal CL.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

[jira] [Commented] (CASSANDRA-19651) idealCLWriteLatency metric reports the worst response time instead of the time when ideal CL is satisfied

Reply via email to