mlbiscoc commented on code in PR #3384:
URL: https://github.com/apache/solr/pull/3384#discussion_r2150416427
##########
solr/core/src/java/org/apache/solr/blockcache/Metrics.java:
##########
@@ -54,8 +55,10 @@ public class Metrics extends SolrCacheBase implements
SolrInfoBean {
private SolrMetricsContext solrMetricsContext;
private long previous = System.nanoTime();
+ // TODO SOLR-17458: Migrate to Otel
Review Comment:
Didn't know about `NOCOMMIT` I updated a few places with it and will use
that instead going forward and keep the rest as TODO for now.
##########
solr/core/src/java/org/apache/solr/metrics/SolrMetricsContext.java:
##########
@@ -116,21 +134,94 @@ public SolrMetricsContext getChildContext(Object child) {
* Register a metric name that this component reports. This method is called
by various metric
* registration methods in {@link org.apache.solr.metrics.SolrMetricManager}
in order to capture
* what metric names are reported from this component (which in turn is
called from {@link
- *
org.apache.solr.metrics.SolrMetricProducer#initializeMetrics(SolrMetricsContext,
String)}).
+ * SolrMetricProducer#initializeMetrics(SolrMetricsContext,
+ * io.opentelemetry.api.common.Attributes, String)}).
*/
+ // TODO We can continue to register metric names
public void registerMetricName(String name) {
metricNames.add(name);
}
/** Return a snapshot of metric values that this component reports. */
+ // TODO Don't think this is needed anymore
public Map<String, Object> getMetricsSnapshot() {
return MetricUtils.convertMetrics(getMetricRegistry(), metricNames);
}
+ public LongCounter longCounter(String metricName, String description) {
Review Comment:
I'm debating if SolrMetricsContext isn't necessarily needed so I'll either
delete it entirely or rename it to just `SolrMetricsScope` to pin instrument
creation to the scope it wants to create to metricManager. I didn't determine
that entirely in this PR yet.
In this PR, I should have mentioned it as well but the concept of
`registryName` just changes to `otel scope` which is why you see when we pass
`registryName` into metric manager. Also if you look at the metrics, there is a
tag that OTEL created itself called `otel_scope_name` which was the registry or
"scope" that this metric was created in. So we still keep the `solr.node`
`solr.core.*` registries unless we want some much more granular.
```
solr_metrics_core_requests_times_milliseconds_bucket{category="QUERY", ...
otel_scope_name="solr.core.foobar.shard1.replica_n1", ...} 1
```
##########
solr/core/src/test/org/apache/solr/handler/admin/MetricsHandlerTest.java:
##########
@@ -706,459 +693,6 @@ public void testExprMetrics() throws Exception {
assertTrue(map.toString(), map.containsKey("count"));
}
- @Test
- @SuppressWarnings("unchecked")
- public void testPrometheusMetricsCore() throws Exception {
Review Comment:
Many of these Prometheus Tests are relating to the PrometheusFormatter
features that was removed. A version of this should come back eventually
further into the migration but with metric naming and attributes may change, I
didn't see a reason to currently just fix it if naming is subject to change.
##########
solr/core/src/java/org/apache/solr/handler/RequestHandlerBase.java:
##########
@@ -193,14 +203,79 @@ public static class HandlerMetrics {
public final Timer requestTimes;
public final Counter totalTime;
- public HandlerMetrics(SolrMetricsContext solrMetricsContext, String...
metricPath) {
+ public AttributedLongCounter otelRequests;
+ public AttributedLongCounter otelNumErrors;
+ public AttributedLongCounter otelNumServerErrors;
+ public AttributedLongCounter otelNumClientErrors;
+ public AttributedLongCounter otelNumTimeouts;
+ public AttributedLongTimer otelRequestTimes;
+
+ public HandlerMetrics(
+ SolrMetricsContext solrMetricsContext, Attributes attributes,
String... metricPath) {
+
+ // TODO SOLR-17458: To be removed
numErrors = solrMetricsContext.meter("errors", metricPath);
numServerErrors = solrMetricsContext.meter("serverErrors", metricPath);
numClientErrors = solrMetricsContext.meter("clientErrors", metricPath);
numTimeouts = solrMetricsContext.meter("timeouts", metricPath);
requests = solrMetricsContext.counter("requests", metricPath);
requestTimes = solrMetricsContext.timer("requestTimes", metricPath);
totalTime = solrMetricsContext.counter("totalTime", metricPath);
+
+ var baseRequestMetric =
+ solrMetricsContext.longCounter("solr_metrics_core_requests", "HTTP
Solr request counts");
+
+ var baseRequestTimeMetric =
+ solrMetricsContext.longHistogram(
+ "solr_metrics_core_requests_times", "HTTP Solr request times",
"ms");
+
+ otelRequests =
+ new AttributedLongCounter(
+ baseRequestMetric,
+ Attributes.builder()
+ .putAll(attributes)
+ .put(AttributeKey.stringKey("type"), "requests")
+ .build());
+
+ otelNumErrors =
Review Comment:
This is a good point actually and updating it with a separate type instead
of inside the value didn't really make sense. Should look like this now:
```
solr_metrics_core_requests_total{category="QUERY",collection="foobar",core="foobar_shard1_replica_n1",handler="/select",internal="false",otel_scope_name="solr.core.foobar.shard1.replica_n1",replica="replica_n1",shard="shard1",source="client",type="errors"}
0.0
solr_metrics_core_requests_total{category="QUERY",collection="foobar",core="foobar_shard1_replica_n1",handler="/select",internal="false",otel_scope_name="solr.core.foobar.shard1.replica_n1",replica="replica_n1",shard="shard1",source="server",type="errors"}
0.0
```
##########
solr/core/src/java/org/apache/solr/util/stats/MetricUtils.java:
##########
@@ -858,4 +861,20 @@ public static <T extends PlatformManagedObject> void
addMXBeanMetrics(
public static MeterProvider getMeterProvider() {
return GlobalOpenTelemetry.getMeterProvider();
}
+
+ public static Attributes createAttributes(String... attributes) {
Review Comment:
Removed
##########
solr/core/src/test/org/apache/solr/handler/RequestHandlerBaseTest.java:
##########
@@ -89,9 +116,16 @@ public void
testEach4xxSolrExceptionIncrementsTheClientErrorCount() {
RequestHandlerBase.processErrorMetricsOnException(e, metrics);
- verify(metrics.numErrors).mark();
- verify(metrics.numClientErrors).mark();
- verifyNoInteractions(metrics.numServerErrors);
+ verify(mockLongCounter, times(1))
+ .add(eq(1L), argThat(attrs ->
"errors".equals(attrs.get(AttributeKey.stringKey("type")))));
Review Comment:
I created a constant in `SolrMetricProducer`
##########
solr/core/src/test/org/apache/solr/metrics/SolrCoreMetricTest.java:
##########
@@ -1,82 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements. See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License. You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.solr.metrics;
-
-import com.codahale.metrics.Metric;
-import io.prometheus.metrics.model.snapshots.Labels;
-import org.apache.solr.SolrTestCaseJ4;
-import org.apache.solr.metrics.prometheus.SolrPrometheusFormatter;
-import org.apache.solr.metrics.prometheus.core.SolrCoreMetric;
-import org.junit.Test;
-
-public class SolrCoreMetricTest extends SolrTestCaseJ4 {
Review Comment:
Added in comments on many of the tests. All these `Solr*MetricTest` are
specific unit tests around these wrappers that was used to parse the dropwizard
output for labels. Don't need any of this anymore since it comes from OTEL.
##########
solr/core/src/java/org/apache/solr/metrics/SolrMetricProducer.java:
##########
@@ -40,20 +41,16 @@ static String getUniqueMetricTag(Object o, String
parentName) {
}
/**
- * Initialize metrics specific to this producer.
- *
- * @param parentContext parent metrics context. If this component has the
same life-cycle as the
- * parent it can simply use the parent context, otherwise it should
obtain a child context
- * using {@link SolrMetricsContext#getChildContext(Object)} passing
<code>this</code> as the
- * child object.
- * @param scope component scope
+ * Deprecated entry point for initializing metrics. TODO SOLR-17458: This
will be removed Change
+ * to only take attributes completely removing Dropwizard
*/
- void initializeMetrics(SolrMetricsContext parentContext, String scope);
+ void initializeMetrics(SolrMetricsContext parentContext, Attributes
attributes, String scope);
Review Comment:
So before with Dropwizard since there was no attributes,
`SolrMetricsContext` and `scope` were used together to determine the scope of
where metrics should be inside of it's JSON. `scope` string for
`RequestHandlerBase` was the handler such as `/select` but doesn't necessarily
need to be a handler.
The idea now is that we won't need a `scope` string since whatever this
scope string was can be inside of Attributes being passed in itself and more.
So `Attributes` passed in here will be the base attributes for whatever is
implementing `SolrMetricProducer`. For `RequestHandlerBase`, the attributes I
felt were important to pass in as a base for every metric being recorded there
was core descriptors like `coreName` `collection` etc and is then built from
there with things like `type=errors`
##########
solr/core/src/java/org/apache/solr/metrics/otel/OtelDoubleMetric.java:
##########
@@ -14,9 +14,9 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
+package org.apache.solr.metrics.otel;
-/**
- * The {@link
org.apache.solr.metrics.prometheus.jvm.SolrPrometheusJvmFormatter} is
responsible for
- * exporting solr.jvm registry metrics to Prometheus.
- */
-package org.apache.solr.metrics.prometheus.jvm;
+@FunctionalInterface
+public interface OtelDoubleMetric {
Review Comment:
Not really actually. I just removed it.
##########
solr/core/src/java/org/apache/solr/metrics/otel/instruments/AttributedLongTimer.java:
##########
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.metrics.otel.instruments;
+
+import io.opentelemetry.api.common.Attributes;
+import io.opentelemetry.api.metrics.LongHistogram;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * This class records the elapsed time (in milliseconds) to a {@link
LongHistogram}. Each thread
+ * measures its own timer allowing concurrent timing operations on separate
threads..
+ */
+public class AttributedLongTimer extends AttributedLongHistogram {
+
+ private final ThreadLocal<Long> startTimeNanos = new ThreadLocal<>();
Review Comment:
I implemented a `MetricTimer` that extends the `RTimer`. This might work a
bit nicer and works similar to how people might have used the Dropwizard Timer
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]