[jira] [Commented] (CASSANDRA-20333) Reduce DecayingEstimatedHistogramReservoir update cost

Maxim Muzafarov (Jira) Wed, 26 Feb 2025 05:08:07 -0800


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-20333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17930674#comment-17930674
 ]


Maxim Muzafarov commented on CASSANDRA-20333:
---------------------------------------------

This makes the picture clear to me. I'd like to help fix this issue properly 
and move it forward, preparing the changes in sync with Dmitry's work on the 
related ThreadLocal Metrics issue.
I've talked to Dmitry and will try to prepare a PR for this soon, probably 
this weekend. 
I'm assigning the issue to myself; I hope you don't mind either, Benedict.

> Reduce DecayingEstimatedHistogramReservoir update cost
> ------------------------------------------------------
>
>                 Key: CASSANDRA-20333
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20333
>             Project: Apache Cassandra
>          Issue Type: Improvement
>          Components: Observability/Metrics
>            Reporter: Dmitry Konstantinov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>
> Based on the discussions in CASSANDRA-20250
> [~benedict]:
> {quote}We can probably improve our reservoir performance if we want to, 
> perhaps in a follow-up patch? For instance, we could have a small 
> thread-local buffer of (time, latency) pairs that we periodically flush 
> together, so that we amortise the memory latency costs. Or we could explore 
> maintaining a per-thread HdrHistogram, that we periodically flush. This would 
> be a good time to explore fully migrating to HdrHistogram, as it has built-in 
> merge semantics iirc. I am not sure what the decayed version would look like 
> there, but I am certain we could maintain a separate decayed HdrHistogram.
> Having a thread-local buffer of updates we intend to flush to the histograms 
> would amortise the latency penalties without fundamentally redesigning 
> anything (as well as reducing contention).
> Other possibilities might include e.g. changing the bucket distribution so we 
> don't need a LUT for computing lg2, although the above would gracefully 
> handle any contribution this has as well.
> {quote}
>  
> Other ideas for squeezing a bit more out of the current design:
>  * the bucket id can be calculated once (currently we calculate it twice, 
> once for the decaying buckets and once for the current buckets), like:
> {code:java}
> int stripe = (int) (Thread.currentThread().getId() & (nStripes - 1));
> int bucket = stripedIndex(index, stripe);
> rescaledDecayingBuckets.update(bucket, now);
> updateBucket(buckets, bucket, 1);
> {code}
>  * for histograms on heavily loaded paths we can use a different number of 
> stripes (the default is 2; we could set, for example, 4 for them)
>  * I noticed some variation in performance in a micro-benchmark (the existing 
> one: DecayingEstimatedHistogramBench) depending on the exact value used for 
> distributionPrime (but I need to double-check it)
>  * the forwardDecayWeight function depends on the SampledClock value, so we 
> can try to recalculate the weight only when the time changes
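
The last idea in the list (recalculating the decay weight only when the sampled time changes) could look roughly like the sketch below. It is an illustration only, not the Cassandra implementation: the class name CachedDecayWeight, the recomputations counter, and the 60-second mean lifetime are assumptions made for the example, and the weight formula is assumed to be a plain exponential of the elapsed time.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: caches the forward-decay weight per sampled clock value.
final class CachedDecayWeight {
    // Assumed constant: mean lifetime of the exponential decay, in seconds.
    private static final double MEAN_LIFETIME_IN_S = 60.0;

    // Immutable (time, weight) pair, published with a single volatile write
    // so a reader never observes a time from one update and a weight from another.
    private static final class Entry {
        final long millis;
        final double weight;
        Entry(long millis, double weight) { this.millis = millis; this.weight = weight; }
    }

    private final long landmarkMillis;
    private final AtomicInteger recomputations = new AtomicInteger(); // demo only
    private volatile Entry cached;

    CachedDecayWeight(long landmarkMillis) { this.landmarkMillis = landmarkMillis; }

    // Returns the weight for nowMillis, calling Math.exp only when the time differs
    // from the cached one. Racing threads may both recompute; the result is the same.
    double weight(long nowMillis) {
        Entry e = cached;
        if (e == null || e.millis != nowMillis) {
            recomputations.incrementAndGet();
            double w = Math.exp(((nowMillis - landmarkMillis) / 1000.0) / MEAN_LIFETIME_IN_S);
            e = new Entry(nowMillis, w);
            cached = e;
        }
        return e.weight;
    }

    int recomputations() { return recomputations.get(); }

    public static void main(String[] args) {
        CachedDecayWeight w = new CachedDecayWeight(0L);
        for (int i = 0; i < 1_000; i++) w.weight(5_000L); // same sampled time: one exp call
        w.weight(6_000L);                                 // time advanced: one more
        System.out.println(w.recomputations());           // prints 2
    }
}
```

The sketch allocates one small object each time the sampled time advances, which should be rare compared to the update rate if the clock is sampled coarsely; whether that beats the cost of Math.exp on the hot path would need to be checked with the benchmark.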

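The thread-local buffer of (time, latency) pairs suggested in the quoted discussion could be sketched as follows. This is a minimal illustration under stated assumptions, not a proposed patch: BufferedRecorder, its Sink interface, and the batch size of 64 are all hypothetical names and values, and the sink stands in for whatever shared histogram the samples are eventually applied to.

```java
// Hypothetical recorder: each thread collects samples locally and applies them
// to the shared sink in batches, so the shared state is touched once per batch.
final class BufferedRecorder {
    private static final int BATCH = 64; // assumed batch size, not tuned

    // Stand-in for the shared reservoir update.
    interface Sink {
        void update(long timeMillis, long value);
    }

    // Per-thread storage for pending (time, value) pairs.
    private static final class Buffer {
        final long[] times = new long[BATCH];
        final long[] values = new long[BATCH];
        int size;
    }

    private final Sink sink;
    private final ThreadLocal<Buffer> local = ThreadLocal.withInitial(Buffer::new);

    BufferedRecorder(Sink sink) { this.sink = sink; }

    // Appends to the calling thread's buffer and flushes when the buffer is full.
    void record(long timeMillis, long value) {
        Buffer b = local.get();
        b.times[b.size] = timeMillis;
        b.values[b.size] = value;
        if (++b.size == BATCH) flush(b);
    }

    // Pushes whatever the calling thread has buffered so far.
    void flushCurrentThread() { flush(local.get()); }

    private void flush(Buffer b) {
        for (int i = 0; i < b.size; i++) sink.update(b.times[i], b.values[i]);
        b.size = 0;
    }

    public static void main(String[] args) {
        long[] applied = {0};
        BufferedRecorder r = new BufferedRecorder((t, v) -> applied[0]++);
        for (int i = 0; i < 100; i++) r.record(i, i);
        System.out.println(applied[0]); // prints 64: one full batch has been flushed
        r.flushCurrentThread();
        System.out.println(applied[0]); // prints 100
    }
}
```

The trade-off is visibility: samples stay invisible to readers until the owning thread flushes, so a thread that goes idle with a partly filled buffer would need a time-based or reader-triggered flush, which this sketch leaves out.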


--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
