If we do swap, we may run into the same issues with third-party
metrics libraries in the next 10-15 years that we are discussing now
with the Codahale library we added ~10-15 years ago. Given that the
proposed new API is quite small, my personal feeling is that it would
be our best choice for the metrics.

Having our own API also doesn't prevent us from having all the
integrations with new third-party libraries the world will develop in
the future -- we can get them simply by writing custom adapters to our
own API; a rough sketch of such an adapter is below. This will be
possible for Codahale (with some compromises, since we have to support
backwards compatibility) and for OpenTelemetry as well. We already have
the CEP-32[1] proposal to instrument metrics; in this sense, it doesn't
change much for us.
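
Just for illustration (none of these names exist yet; the LongSupplier
below stands in for a metric from our not-yet-defined API), such an
adapter can be a few lines of code:

    import com.codahale.metrics.Gauge;
    import com.codahale.metrics.MetricRegistry;
    import java.util.function.LongSupplier;

    // Illustrative sketch only: expose a metric from our own API to any
    // existing Codahale-based reporter through a thin Gauge adapter.
    public final class CodahaleGaugeAdapter
    {
        public static void register(MetricRegistry registry, String name, LongSupplier ourMetric)
        {
            // Gauge is a functional interface, so the adapter is a one-liner.
            registry.register(name, (Gauge<Long>) ourMetric::getAsLong);
        }
    }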

Another point in favour of having our own API is the virtual tables we
have -- it gives us enough flexibility and latitude to export the
metrics efficiently via the virtual tables by implementing the access
patterns we consider important (see the sketch below).
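
As a very rough sketch only (it reuses the existing virtual-table
framework; the table name, columns and metric iteration are
placeholders, not a design), such a table could look like:

    import org.apache.cassandra.db.marshal.LongType;
    import org.apache.cassandra.db.marshal.UTF8Type;
    import org.apache.cassandra.db.virtual.AbstractVirtualTable;
    import org.apache.cassandra.db.virtual.SimpleDataSet;
    import org.apache.cassandra.dht.LocalPartitioner;
    import org.apache.cassandra.schema.TableMetadata;

    // Sketch: one row per counter, keyed by metric name.
    public class MetricCountersTable extends AbstractVirtualTable
    {
        MetricCountersTable(String keyspace)
        {
            super(TableMetadata.builder(keyspace, "metric_counters")
                               .kind(TableMetadata.Kind.VIRTUAL)
                               .partitioner(new LocalPartitioner(UTF8Type.instance))
                               .addPartitionKeyColumn("name", UTF8Type.instance)
                               .addRegularColumn("value", LongType.instance)
                               .build());
        }

        @Override
        public DataSet data()
        {
            SimpleDataSet result = new SimpleDataSet(metadata());
            // for each (name, value) exposed by our own registry (placeholder):
            //     result.row(name).column("value", value);
            return result;
        }
    }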

[1] 
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=255071749#CEP32:(DRAFT)OpenTelemetryintegration-ExportingMetricsthroughOpenTelemetry
[2] https://opentelemetry.io/docs/languages/java/instrumentation/

On Wed, 5 Mar 2025 at 21:35, Jeff Jirsa <jji...@gmail.com> wrote:
>
> I think it's widely accepted that OTel in general has won this stage of 
> observability, as most metrics systems allow it and most SaaS providers 
> support it. So Jon’s point there is important.
>
> The promise of unifying logs/traces/metrics (aka wide events) is usually far 
> more important on the tracing side of our observability than in the areas 
> where we use Codahale/DropWizard.
>
> Scott: if we swap, we can (probably should) deprecate like everything else, 
> and run both side by side for a release so people don’t lose metrics entirely 
> on bounce? FF both, to control double cost during the transition.
>
>
>
>
> On Mar 5, 2025, at 8:21 PM, C. Scott Andreas <sc...@paradoxica.net> wrote:
>
> No strong opinion on particular choice of metrics library.
>
> My primary feedback is that if we swap metrics implementations and the new 
> values are *different*, we can anticipate broad user confusion/interest.
>
> In particular, if latency stats are reported higher post-upgrade, we should 
> expect users to interpret this as a performance regression, dedicate 
> significant resources to investigating the change, and expend credibility 
> with stakeholders in their systems.
>
> - Scott
>
> On Mar 5, 2025, at 11:57 AM, Benedict <bened...@apache.org> wrote:
>
> 
> I really like the idea of integrating tracing, metrics and logging frameworks.
>
> I would like to have the time to look closely at the API before we decide to 
> adopt it though. I agree that a widely deployed API has inherent benefits, 
> but any API we adopt also shapes future evolution of our capabilities. 
> Hopefully this is also a good API that allows us plenty of evolutionary 
> headroom.
>
>
> On 5 Mar 2025, at 19:45, Josh McKenzie <jmcken...@apache.org> wrote:
>
> 
>
> if the plan is to rip out something old and unmaintained and replace with 
> something new, I think there's a huge win to be had by implementing the 
> standard that everyone's using now.
>
> Strong +1 on anything that's an ecosystem integration inflection point. The 
> added benefit here is that if we architect ourselves to gracefully integrate 
> with whatever systems are ubiquitous today, we'll inherit the migration work 
> that any new industry-wide replacement system would need to do to become the 
> new de facto standard.
>
> On Wed, Mar 5, 2025, at 2:23 PM, Jon Haddad wrote:
>
> Thank you for the replies.
>
> Dmitry: Based on some other patches you've worked on and your explanation 
> here, it looks like you're optimizing the front-door portion of the write path - 
> very cool.  Testing it in isolation with those settings makes sense if your 
> goal is to push write throughput as far as you can, something I'm very much 
> on board with, and is a key component to pushing density and reducing cost.  
> I'm spinning up a 5.0 cluster now to run a test, so I'll run a load test 
> similar to what you've done and try to reproduce your results.  I'll also 
> review the JIRA to get more familiar with what you're working on.
>
> Benedict: I agree with your line of thinking around optimizing the cost of 
> metrics.  As we push both density and multi-tenancy, there's going to be more 
> and more demand for clusters with hundreds or thousands of tables.  Maybe 
> tens of thousands.  Reducing overhead for something that's O(N * M) (multiple 
> counters per table) will definitely be a welcome improvement.  There's always 
> more stuff that's going to get in the way, but it's an elephant and I 
> appreciate every bite.
>
> My main concern with metrics isn't really compatibility, and I don't have any 
> real investment in DropWizard.  I don't know if there's any real value in 
> putting in effort to maintain compatibility, but I'm just one sample, so I 
> won't make a strong statement here.
>
> It would be *very nice* if we moved to metrics which implement the 
> OpenTelemetry Metrics API [1], which I think solves multiple issues at once 
> (a rough sketch of the API usage follows the list):
>
> * We can use either one of the existing implementations (OTel SDK) or our own
> * We get a "free" upgrade that lets people tap into the OTel ecosystem
> * It paves the way for OTel traces with ZipKin [2] / Jaeger [3]
> * We can use the ubiquitous OTel instrumentation agent to send metrics to the 
> OTel collector, meaning people can collect at a much higher frequency than 
> today
> * OTel logging is a significant improvement over logback; you can correlate 
> metrics + traces + logs together.
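>
> For illustration (the meter/counter names here are just placeholders, not a
> proposal), counter usage via the OTel Metrics API looks roughly like this:
>
>     import io.opentelemetry.api.GlobalOpenTelemetry;
>     import io.opentelemetry.api.common.AttributeKey;
>     import io.opentelemetry.api.common.Attributes;
>     import io.opentelemetry.api.metrics.LongCounter;
>     import io.opentelemetry.api.metrics.Meter;
>
>     // Sketch: register and update a counter through the OTel Metrics API.
>     // The backing implementation can be the OTel SDK or our own.
>     public final class OtelCounterExample
>     {
>         private static final Meter METER = GlobalOpenTelemetry.getMeter("org.apache.cassandra");
>         private static final LongCounter WRITES =
>             METER.counterBuilder("cassandra.coordinator.writes")
>                  .setUnit("1")
>                  .setDescription("Coordinator-level write requests")
>                  .build();
>
>         public static void onWrite(String keyspace)
>         {
>             WRITES.add(1, Attributes.of(AttributeKey.stringKey("keyspace"), keyspace));
>         }
>     }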
>
> Anyways, if the plan is to rip out something old and unmaintained and replace 
> with something new, I think there's a huge win to be had by implementing the 
> standard that everyone's using now.
>
> All this is very exciting and I appreciate the discussion!
>
> Jon
>
> [1] https://opentelemetry.io/docs/languages/java/api/
> [2] https://zipkin.io/
> [3] https://www.jaegertracing.io/
>
>
>
>
> On Wed, Mar 5, 2025 at 2:58 AM Dmitry Konstantinov <netud...@gmail.com> wrote:
>
> Hi Jon
>
> >>  Is there a specific workload you're running where you're seeing it take 
> >> up a significant % of CPU time?  Could you share some metrics, profile 
> >> data, or a workload so I can try to reproduce your findings?
> Yes, I have shared the workload generation command (sorry, it is in 
> cassandra-stress; I have not yet adopted your tool but want to do so soon :-) ), 
> setup details, and the async-profiler CPU profile in CASSANDRA-20250.
> A summary:
>
> - it is a plain insert-only workload to assess the max throughput capacity of a 
> single node: ./tools/bin/cassandra-stress "write n=10m" -rate threads=100 
> -node myhost
> - a small amount of data per row is inserted and local SSD disks are used, so 
> CPU is the primary bottleneck in this scenario (while it is quite synthetic, 
> in my real business cases CPU is the primary bottleneck as well)
> - I used the 5.1 trunk version (I saw similar results for 5.0 while checking 
> CASSANDRA-20165)
> - I enabled trie memtables + offheap objects mode
> - I disabled compaction
> - a recent nightly build of async-profiler is used
> - my hardware is quite old: on-premise VM, Linux 4.18.0-240.el8.x86_64, 
> OpenJdk-11.0.26+4, Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz, 16 cores
> - link to CPU profile ("codahale" code: 8.65%)
> - the -XX:+DebugNonSafepoints option is enabled to improve the profile precision
>
>
> On Wed, 5 Mar 2025 at 12:38, Benedict Elliott Smith <bened...@apache.org> 
> wrote:
>
> Some quick thoughts of my own…
>
> === Performance ===
> - I have seen heap dumps with > 1GiB dedicated to metric counters. This patch 
> should improve this, while opening up room to cut it further, steeply.
> - The performance improvement in relative terms for the metrics being 
> replaced is rather dramatic - about 80%. We can also improve this further.
> - Cheaper metrics (in terms of both CPU and memory) mean we can readily have 
> more of them, exposing finer-grained details. It is hard to overstate the 
> value of this.
>
> === Reporting ===
> - We’re already non-standard for our most important metrics, because we had 
> to replace the Codahale histogram years ago
> - We can continue implementing the Codahale interfaces, so that exporting 
> libraries have minimal work to support us
> - We can probably push patches upstream to a couple of selected libraries we 
> consider important
> - I would in any case also support picking a new reporting framework, but I 
> would like us to do this with great care to avoid repeating our mistakes. I 
> won’t have cycles to actually implement this, so it would be down to others 
> to decide if they are willing to undertake this work
>
> I think the fallback option for now, however, is to abuse unsafe to allow us 
> to override the implementation details of Codahale metrics. So we can 
> decouple the performance discussion for now from the deprecation discussion, 
> but I think we should have a target of deprecating Codahale/DropWizard for 
> the reasons Dmitry outlines, however we decide to do it.
>
> On 4 Mar 2025, at 21:17, Jon Haddad <j...@rustyrazorblade.com> wrote:
>
> I've got a few thoughts...
>
> On the performance side, I took a look at a few CPU profiles from past 
> benchmarks and I'm seeing DropWizard taking ~ 3% of CPU time.  Is there a 
> specific workload you're running where you're seeing it take up a significant 
> % of CPU time?  Could you share some metrics, profile data, or a workload so 
> I can try to reproduce your findings?  In my testing I've found the majority 
> of the overhead from metrics to come from JMX, not DropWizard.
>
> On the operator side, inventing our own metrics lib risks making it harder 
> to instrument Cassandra.  There are libraries out there that allow you 
> to tap into DropWizard metrics directly.  For example, Sarma Pydipally did a 
> presentation on this last year [1] based on some code I threw together.
>
> If you're planning on making it easier to instrument C* by supporting sending 
> metrics to the OTel collector [2], then I could see the change being a net 
> win as long as the perf is no worse than the status quo.
>
> It's hard to know the full extent of what you're planning and the impact, so 
> I'll save any opinions till I know more about the plan.
>
> Thanks for bringing this up!
> Jon
>
> [1] 
> https://planetcassandra.org/leaf/apache-cassandra-lunch-62-grafana-dashboard-for-apache-cassandra-business-platform-team/
> [2] https://opentelemetry.io/docs/collector/
>
> On Tue, Mar 4, 2025 at 12:40 PM Dmitry Konstantinov <netud...@gmail.com> 
> wrote:
>
> Hi all,
>
> After a long conversation with Benedict and Maxim in CASSANDRA-20250 I would 
> like to raise and discuss a proposal to deprecate Dropwizard/Codahale metrics 
> usage in the next major release of the Cassandra server and drop it in the 
> following major release.
> In its place, our own Java API and implementation would be introduced. For 
> the next major release the Dropwizard/Codahale API is still planned to be 
> supported by extending the Codahale implementations, to give potential users 
> of this API enough time to transition (a rough sketch of this bridging is 
> shown below).
> The proposal does not affect the JMX API for metrics; it is only about local 
> Java API changes within the Cassandra server classpath, i.e. it concerns the 
> cases where somebody outside of the Cassandra server code relies on the 
> Codahale API in extensions or agents.
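>
> To illustrate the backwards-compatibility idea only (the AtomicLong below is
> just a stand-in for whatever leaner internal implementation CASSANDRA-20250
> ends up using), the bridging could look roughly like:
>
>     import java.util.concurrent.atomic.AtomicLong;
>     import com.codahale.metrics.Counter;
>
>     // Sketch: our own counter keeps extending the Codahale Counter for one
>     // release, so agents and exporters that expect com.codahale.metrics.Counter
>     // keep working during the transition.
>     public class BridgedCounter extends Counter
>     {
>         private final AtomicLong value = new AtomicLong();
>
>         @Override public void inc()       { inc(1); }
>         @Override public void inc(long n) { value.addAndGet(n); }
>         @Override public void dec()       { dec(1); }
>         @Override public void dec(long n) { value.addAndGet(-n); }
>         @Override public long getCount()  { return value.get(); }
>     }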
>
> Reasons:
> 1) The Codahale metrics implementation is not very efficient from a CPU and 
> memory usage point of view. In the past we already replaced the default 
> Codahale Reservoir implementation with our custom one, and now in 
> CASSANDRA-20250 we (Benedict and I) want to add a more efficient 
> implementation of the Counter and Meter logic. So, in total, not much logic 
> from the original library is left (mostly MetricRegistry as a container for 
> metrics); the majority is implemented by ourselves.
> We use metrics a lot along the read and write paths and they contribute a 
> visible overhead (for example, for a plain write load it is about 9-11% 
> according to the async-profiler CPU profile), so we want them to be highly 
> optimized.
> From a memory perspective, Counter and Meter are built on LongAdder and are 
> quite heavy for the quantities we create and use.
>
> 2) Codahale metrics do not provide any way to replace the Counter and Meter 
> implementations. There are no fully functional interfaces for these entities, 
> and MetricRegistry has casts/checks against the concrete implementations and 
> cannot work with anything else.
> I looked through the already-reported issues and found the following similar 
> but unsuccessful attempt to introduce interfaces for metrics: 
> https://github.com/dropwizard/metrics/issues/2186
> as well as other older attempts:
> https://github.com/dropwizard/metrics/issues/252
> https://github.com/dropwizard/metrics/issues/264
> https://github.com/dropwizard/metrics/issues/703
> https://github.com/dropwizard/metrics/pull/487
> https://github.com/dropwizard/metrics/issues/479
> https://github.com/dropwizard/metrics/issues/253
>
> So, requesting extensibility from Codahale metrics does not look like a 
> realistic option.
>
> 3) It looks like the library is in maintenance mode now; the 5.x version is 
> on hold and many integrations are also not very alive.
> The main benefit of using Codahale metrics should be the huge number of 
> reporters/integrations, but if we carefully check the list of reporters 
> mentioned here: 
> https://metrics.dropwizard.io/4.2.0/manual/third-party.html#reporters
> we can see that almost all of them are dead/archived.
>
> 4) In general, exposing third-party libraries as our own public API 
> frequently creates too many limitations and issues (Guava is another typical 
> example I have seen previously: it is easy to start with, but later you 
> struggle more and more).
>
> Does anyone have any questions or concerns regarding this suggestion?
> --
> Dmitry Konstantinov
>
>
>
>
> --
> Dmitry Konstantinov
>
>
