Re: Dropwizard/Codahale metrics deprecation in Cassandra server

2025-03-12 Thread Benedict
It sounds like for the original query we have a broad consensus:
1) Deprecate Codahale, but for the next major version publish compatible metrics
2) After the next release, move to a codahale-like registry that allows us to be efficient without abusing unsafe, and continue publishing metrics that imp

Re: Dropwizard/Codahale metrics deprecation in Cassandra server

2025-03-11 Thread Jon Haddad
Absolutely, happy to share. All tests were done using easy-cass-stress v9 and easy-cass-lab, with the latest released 5.0 (not including 15452 or 20092). Instructions at the end.
> Regarding allocation rate vs throughput, unfortunately allocation rate vs throughput are not connected linearly, Y

Re: Dropwizard/Codahale metrics deprecation in Cassandra server

2025-03-11 Thread Dmitry Konstantinov
Jon, thank you for testing! Can you share your CPU profile and test load details? Have you tested it with CASSANDRA-20092 changes included?
>> Allocations related to codahale were < 1%.
Just to clarify: in the initial mail, by memory footprint I mean the static amount of memory used to store metri

Re: Dropwizard/Codahale metrics deprecation in Cassandra server

2025-03-11 Thread Jon Haddad
Definitely +1 on registry + docs. I believe that's part of the OTel Java SDK [1][2]. I did some performance testing yesterday and was able to replicate the findings where the codahale code path took 7-10% of CPU time. The only caveat is that it only happens with compaction disabled. Once compact

Re: Dropwizard/Codahale metrics deprecation in Cassandra server

2025-03-11 Thread Chris Lohfink
Just something to be mindful of: what we had *before* codahale in Cassandra, and avoiding that again. Pre 1.1 it was pretty much impossible to collect metrics without looking at code (there were efficient custom made things, but each metric was reported differently) and that stuck through until 2.2 d

Re: Dropwizard/Codahale metrics deprecation in Cassandra server

2025-03-11 Thread Josh McKenzie
> Having something like a registry and standardizing/enforcing all metric types is something we should be sure to maintain.
A registry w/documentation on each metric indicating *what it's actually measuring and what it means* would be great for our users.
On Mon, Mar 10, 2025, at 3:46 PM, Chri

Re: Dropwizard/Codahale metrics deprecation in Cassandra server

2025-03-07 Thread Jon Haddad
As long as operators are able to use all the OTel tooling, I'm happy. I'm not looking to try to decide what the metrics API looks like, although I think trying to plan for 15 years out is a bit unnecessary. A lot of the DB will be replaced by then. That said, I'm mostly hands off on code and you

Re: Dropwizard/Codahale metrics deprecation in Cassandra server

2025-03-05 Thread Patrick McFadin
We can also do an education campaign to get people to migrate. There will be good reasons to do it.
On Wed, Mar 5, 2025 at 12:33 PM Jeff Jirsa wrote:
> I think widely accepted that otel in general has won this stage of observability, as most metrics systems allow it and most saas providers

Re: Dropwizard/Codahale metrics deprecation in Cassandra server

2025-03-05 Thread Maxim Muzafarov
If we do swap, we may run into the same issues with third-party metrics libraries in the next 10-15 years that we are discussing now with the Codahale we added ~10-15 years ago, and given the fact that a proposed new API is quite small my personal feeling is that it would be our best choice for the

Re: Dropwizard/Codahale metrics deprecation in Cassandra server

2025-03-05 Thread Jeff Jirsa
I think it's widely accepted that otel in general has won this stage of observability, as most metrics systems allow it and most saas providers support it. So Jon’s point there is important. The promise of unifying logs/traces/metrics usually (aka wide events) is far more important in the tracing side o

Re: Dropwizard/Codahale metrics deprecation in Cassandra server

2025-03-05 Thread C. Scott Andreas
No strong opinion on particular choice of metrics library. My primary feedback is that if we swap metrics implementations and the new values are *different*, we can anticipate broad user confusion/interest. In particular if latency stats are reported higher post-upgrade, we should expect users to int

Re: Dropwizard/Codahale metrics deprecation in Cassandra server

2025-03-05 Thread Benedict
I really like the idea of integrating tracing, metrics and logging frameworks. I would like to have the time to look closely at the API before we decide to adopt it though. I agree that a widely deployed API has inherent benefits, but any API we adopt also shapes future evolution of our capabilities

Re: Dropwizard/Codahale metrics deprecation in Cassandra server

2025-03-05 Thread Jon Haddad
Thank you for the replies. Dmitry: Based on some other patches you've worked on and your explanation here, it looks like you're optimizing the front door portion of the write path - very cool. Testing it in isolation with those settings makes sense if your goal is to push write throughput as far as y

Re: Dropwizard/Codahale metrics deprecation in Cassandra server

2025-03-05 Thread Josh McKenzie
> if the plan is to rip out something old and unmaintained and replace with something new, I think there's a huge win to be had by implementing the standard that everyone's using now.
Strong +1 on anything that's an ecosystem integration inflection point. The added benefit here is that if we

Re: Dropwizard/Codahale metrics deprecation in Cassandra server

2025-03-05 Thread Dmitry Konstantinov
Hi Jon
>> Is there a specific workload you're running where you're seeing it take up a significant % of CPU time? Could you share some metrics, profile data, or a workload so I can try to reproduce your findings?
Yes, I have shared the workload generation command (sorry, it is in cassandra-stres

Re: Dropwizard/Codahale metrics deprecation in Cassandra server

2025-03-05 Thread Benedict Elliott Smith
Some quick thoughts of my own…
=== Performance ===
- I have seen heap dumps with > 1GiB dedicated to metric counters. This patch should improve this, while opening up room to cut it further, steeply.
- The performance improvement in relative terms for the metrics being replaced is rather dramati

Re: Dropwizard/Codahale metrics deprecation in Cassandra server

2025-03-04 Thread Jon Haddad
I've got a few thoughts... On the performance side, I took a look at a few CPU profiles from past benchmarks and I'm seeing DropWizard taking ~ 3% of CPU time. Is there a specific workload you're running where you're seeing it take up a significant % of CPU time? Could you share some metrics, pr

Dropwizard/Codahale metrics deprecation in Cassandra server

2025-03-04 Thread Dmitry Konstantinov
Hi all, After a long conversation with Benedict and Maxim in CASSANDRA-20250 I would like to raise and discuss a proposal to deprecate Dropwizard/Codahale metrics usage in the next major release of Cassandra server and drop it in the followin