[ https://issues.apache.org/jira/browse/CASSANDRA-20332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dmitry Konstantinov updated CASSANDRA-20332:
--------------------------------------------
    Change Category: Performance
         Complexity: Normal
             Status: Open  (was: Triage Needed)

> Provide the ability to disable specific metrics collection
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-20332
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20332
>             Project: Apache Cassandra
>          Issue Type: New Feature
>          Components: Observability/Metrics
>            Reporter: Dmitry Konstantinov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>
> Cassandra collects a large number of metrics. Many of them are collected per
> table, so the number of metric instances is multiplied by the number of
> tables. On the one hand this gives better observability; on the other hand
> metrics are not free, and there is an overhead associated with them:
> 1) CPU overhead: for a simple CPU-bound load, I already see about 5.5% of
> total CPU spent on metrics in CPU flame graphs for a read load and 11% for a
> write load.
> Example: [^cpu_profile_insert.html] (search for the "codahale" pattern). The
> flame graph was captured using the Async profiler build:
> async-profiler-3.0-29ee888-linux-x64
> 2) Memory overhead: we spend memory on the entities used to aggregate
> metrics, such as LongAdders and reservoirs, plus memory for MBeans (String
> concatenation within object names is a major cause of this: a new String is
> created for each table + metric name combination).
> CASSANDRA-20250 should reduce the CPU and memory overhead, but ideally we
> should not pay for metrics that we do not need.
> The idea of this ticket is to allow an operator to configure a list of
> disabled metrics in cassandra.yaml, like:
> {code:java}
> disabled_metrics:
> - metric_a
> - metric_b
> {code}
> From an implementation point of view I see two possible approaches (which
> can be combined):
> # Generic: when a metric is being registered, if it is listed in
> disabled_metrics we do not publish it via JMX and we provide a no-op
> implementation of the metric object (such as a histogram) for it.
> Logging analogy: the log level check within a log method.
> # Specialized: for some metrics the value calculation itself is not free and
> introduces overhead as well. In such cases it would be useful to check
> within the specific logic, using an API (like isMetricEnabled), whether the
> calculation is needed at all. Example of such a metric:
> ClientRequestSizeMetrics.recordRowAndColumnCountMetrics
> Logging analogy: an explicit 'if (isDebugEnabled())' condition used when a
> message parameter is expensive.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
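
The two approaches from the ticket can be illustrated with a minimal, self-contained Java sketch. It is not Cassandra code and does not use the real metrics library: the Counter interface, the Registry class, NOOP_COUNTER and recordColumnCount are hypothetical names introduced only for illustration, and the set passed to the registry stands in for the proposed disabled_metrics list. Only isMetricEnabled is a name taken from the ticket, and its signature here is an assumption.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

class DisabledMetricsSketch {

    /** Hypothetical minimal counter type, standing in for a metrics-library class. */
    interface Counter {
        void inc(long n);
        long getCount();
    }

    /** Real counter backed by a LongAdder. */
    static final class LongAdderCounter implements Counter {
        private final LongAdder adder = new LongAdder();
        public void inc(long n) { adder.add(n); }
        public long getCount() { return adder.sum(); }
    }

    /** Shared no-op instance: updates are dropped, no per-metric state is allocated. */
    static final Counter NOOP_COUNTER = new Counter() {
        public void inc(long n) { }
        public long getCount() { return 0L; }
    };

    /** Hypothetical registry; 'disabled' models the proposed disabled_metrics list. */
    static final class Registry {
        private final Set<String> disabled;
        private final Map<String, Counter> published = new ConcurrentHashMap<>();

        Registry(Set<String> disabled) { this.disabled = Set.copyOf(disabled); }

        /** Generic approach: a disabled metric is never published and gets the no-op object. */
        Counter counter(String name) {
            if (!isMetricEnabled(name)) return NOOP_COUNTER;
            return published.computeIfAbsent(name, n -> new LongAdderCounter());
        }

        boolean isMetricEnabled(String name) { return !disabled.contains(name); }

        /** Names that would be exposed (for example via JMX) in a real system. */
        Set<String> publishedNames() { return Set.copyOf(published.keySet()); }
    }

    /** Specialized approach: skip the value calculation itself when the metric is disabled. */
    static void recordColumnCount(Registry registry, String metricName, List<List<String>> rows) {
        if (!registry.isMetricEnabled(metricName)) return; // like 'if (isDebugEnabled())'
        long columns = 0;
        for (List<String> row : rows) columns += row.size(); // the "expensive" part
        registry.counter(metricName).inc(columns);
    }

    public static void main(String[] args) {
        Registry registry = new Registry(Set.of("metric_a"));
        List<List<String>> rows = List.of(List.of("k", "v"), List.of("k", "v", "ts"));

        registry.counter("metric_a").inc(5);          // dropped by the no-op counter
        recordColumnCount(registry, "metric_a", rows); // returns before counting columns
        recordColumnCount(registry, "metric_b", rows); // counts 2 + 3 columns

        System.out.println(registry.counter("metric_a").getCount()); // 0
        System.out.println(registry.counter("metric_b").getCount()); // 5
        System.out.println(registry.publishedNames());               // [metric_b]
    }
}
```

In this sketch the generic path keeps call sites unchanged, since callers still receive an object they can update, while the specialized path needs an explicit check at each call site where computing the value is itself costly.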