mlbiscoc commented on code in PR #3384:
URL: https://github.com/apache/solr/pull/3384#discussion_r2150416427


##########
solr/core/src/java/org/apache/solr/blockcache/Metrics.java:
##########
@@ -54,8 +55,10 @@ public class Metrics extends SolrCacheBase implements 
SolrInfoBean {
   private SolrMetricsContext solrMetricsContext;
   private long previous = System.nanoTime();
 
+  // TODO SOLR-17458: Migrate to Otel

Review Comment:
   Didn't know about `NOCOMMIT` I updated a few places with it and will use 
that instead going forward and keep the rest as TODO for now.



##########
solr/core/src/java/org/apache/solr/metrics/SolrMetricsContext.java:
##########
@@ -116,21 +134,94 @@ public SolrMetricsContext getChildContext(Object child) {
    * Register a metric name that this component reports. This method is called 
by various metric
    * registration methods in {@link org.apache.solr.metrics.SolrMetricManager} 
in order to capture
    * what metric names are reported from this component (which in turn is 
called from {@link
-   * 
org.apache.solr.metrics.SolrMetricProducer#initializeMetrics(SolrMetricsContext,
 String)}).
+   * SolrMetricProducer#initializeMetrics(SolrMetricsContext,
+   * io.opentelemetry.api.common.Attributes, String)}).
    */
+  // TODO We can continue to register metric names
   public void registerMetricName(String name) {
     metricNames.add(name);
   }
 
   /** Return a snapshot of metric values that this component reports. */
+  // TODO Don't think this is needed anymore
   public Map<String, Object> getMetricsSnapshot() {
     return MetricUtils.convertMetrics(getMetricRegistry(), metricNames);
   }
 
+  public LongCounter longCounter(String metricName, String description) {

Review Comment:
   I'm debating if SolrMetricsContext isn't necessarily needed so I'll either 
delete it entirely or rename it to just `SolrMetricsScope` to pin instrument 
creation to the scope it wants to create to metricManager. I didn't determine 
that entirely in this PR yet.
   
   In this PR, I should have mentioned it as well but the concept of 
`registryName` just changes to `otel scope` which is why you see when we pass 
`registryName` into metric manager. Also if you look at the metrics, there is a 
tag that OTEL created itself called `otel_scope_name` which was the registry or 
"scope" that this metric was created in. So we still keep the `solr.node` 
`solr.core.*` registries unless we want some much more granular.
   
   ```
   solr_metrics_core_requests_times_milliseconds_bucket{category="QUERY", ... 
otel_scope_name="solr.core.foobar.shard1.replica_n1", ...} 1
   
   ```



##########
solr/core/src/test/org/apache/solr/handler/admin/MetricsHandlerTest.java:
##########
@@ -706,459 +693,6 @@ public void testExprMetrics() throws Exception {
     assertTrue(map.toString(), map.containsKey("count"));
   }
 
-  @Test
-  @SuppressWarnings("unchecked")
-  public void testPrometheusMetricsCore() throws Exception {

Review Comment:
   Many of these Prometheus Tests are relating to the PrometheusFormatter 
features that was removed. A version of this should come back eventually 
further into the migration but with metric naming and attributes may change, I 
didn't see a reason to currently just fix it if naming is subject to change.



##########
solr/core/src/java/org/apache/solr/handler/RequestHandlerBase.java:
##########
@@ -193,14 +203,79 @@ public static class HandlerMetrics {
     public final Timer requestTimes;
     public final Counter totalTime;
 
-    public HandlerMetrics(SolrMetricsContext solrMetricsContext, String... 
metricPath) {
+    public AttributedLongCounter otelRequests;
+    public AttributedLongCounter otelNumErrors;
+    public AttributedLongCounter otelNumServerErrors;
+    public AttributedLongCounter otelNumClientErrors;
+    public AttributedLongCounter otelNumTimeouts;
+    public AttributedLongTimer otelRequestTimes;
+
+    public HandlerMetrics(
+        SolrMetricsContext solrMetricsContext, Attributes attributes, 
String... metricPath) {
+
+      // TODO SOLR-17458: To be removed
       numErrors = solrMetricsContext.meter("errors", metricPath);
       numServerErrors = solrMetricsContext.meter("serverErrors", metricPath);
       numClientErrors = solrMetricsContext.meter("clientErrors", metricPath);
       numTimeouts = solrMetricsContext.meter("timeouts", metricPath);
       requests = solrMetricsContext.counter("requests", metricPath);
       requestTimes = solrMetricsContext.timer("requestTimes", metricPath);
       totalTime = solrMetricsContext.counter("totalTime", metricPath);
+
+      var baseRequestMetric =
+          solrMetricsContext.longCounter("solr_metrics_core_requests", "HTTP 
Solr request counts");
+
+      var baseRequestTimeMetric =
+          solrMetricsContext.longHistogram(
+              "solr_metrics_core_requests_times", "HTTP Solr request times", 
"ms");
+
+      otelRequests =
+          new AttributedLongCounter(
+              baseRequestMetric,
+              Attributes.builder()
+                  .putAll(attributes)
+                  .put(AttributeKey.stringKey("type"), "requests")
+                  .build());
+
+      otelNumErrors =

Review Comment:
   This is a good point actually and updating it with a separate type instead 
of inside the value didn't really make sense. Should look like this now:
   
   ```
   
solr_metrics_core_requests_total{category="QUERY",collection="foobar",core="foobar_shard1_replica_n1",handler="/select",internal="false",otel_scope_name="solr.core.foobar.shard1.replica_n1",replica="replica_n1",shard="shard1",source="client",type="errors"}
 0.0
   
solr_metrics_core_requests_total{category="QUERY",collection="foobar",core="foobar_shard1_replica_n1",handler="/select",internal="false",otel_scope_name="solr.core.foobar.shard1.replica_n1",replica="replica_n1",shard="shard1",source="server",type="errors"}
 0.0
   ```



##########
solr/core/src/java/org/apache/solr/util/stats/MetricUtils.java:
##########
@@ -858,4 +861,20 @@ public static <T extends PlatformManagedObject> void 
addMXBeanMetrics(
   public static MeterProvider getMeterProvider() {
     return GlobalOpenTelemetry.getMeterProvider();
   }
+
+  public static Attributes createAttributes(String... attributes) {

Review Comment:
   Removed



##########
solr/core/src/test/org/apache/solr/handler/RequestHandlerBaseTest.java:
##########
@@ -89,9 +116,16 @@ public void 
testEach4xxSolrExceptionIncrementsTheClientErrorCount() {
 
     RequestHandlerBase.processErrorMetricsOnException(e, metrics);
 
-    verify(metrics.numErrors).mark();
-    verify(metrics.numClientErrors).mark();
-    verifyNoInteractions(metrics.numServerErrors);
+    verify(mockLongCounter, times(1))
+        .add(eq(1L), argThat(attrs -> 
"errors".equals(attrs.get(AttributeKey.stringKey("type")))));

Review Comment:
   I created a constant in `SolrMetricProducer`



##########
solr/core/src/test/org/apache/solr/metrics/SolrCoreMetricTest.java:
##########
@@ -1,82 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.solr.metrics;
-
-import com.codahale.metrics.Metric;
-import io.prometheus.metrics.model.snapshots.Labels;
-import org.apache.solr.SolrTestCaseJ4;
-import org.apache.solr.metrics.prometheus.SolrPrometheusFormatter;
-import org.apache.solr.metrics.prometheus.core.SolrCoreMetric;
-import org.junit.Test;
-
-public class SolrCoreMetricTest extends SolrTestCaseJ4 {

Review Comment:
   Added in comments on many of the tests. All these `Solr*MetricTest` are 
specific unit tests around these wrappers that was used to parse the dropwizard 
output for labels. Don't need any of this anymore since it comes from OTEL.



##########
solr/core/src/java/org/apache/solr/metrics/SolrMetricProducer.java:
##########
@@ -40,20 +41,16 @@ static String getUniqueMetricTag(Object o, String 
parentName) {
   }
 
   /**
-   * Initialize metrics specific to this producer.
-   *
-   * @param parentContext parent metrics context. If this component has the 
same life-cycle as the
-   *     parent it can simply use the parent context, otherwise it should 
obtain a child context
-   *     using {@link SolrMetricsContext#getChildContext(Object)} passing 
<code>this</code> as the
-   *     child object.
-   * @param scope component scope
+   * Deprecated entry point for initializing metrics. TODO SOLR-17458: This 
will be removed Change
+   * to only take attributes completely removing Dropwizard
    */
-  void initializeMetrics(SolrMetricsContext parentContext, String scope);
+  void initializeMetrics(SolrMetricsContext parentContext, Attributes 
attributes, String scope);

Review Comment:
   So before with Dropwizard since there was no attributes, 
`SolrMetricsContext` and `scope` were used together to determine the scope of 
where metrics should be inside of it's JSON. `scope` string for 
`RequestHandlerBase` was the handler such as `/select` but doesn't necessarily 
need to be a handler.
   
   The idea now is that we won't need a `scope` string since whatever this 
scope string was can be inside of Attributes being passed in itself and more. 
So `Attributes` passed in here will be the base attributes for whatever is 
implementing `SolrMetricProducer`. For `RequestHandlerBase`, the attributes I 
felt were important to pass in as a base for every metric being recorded there 
was core descriptors like `coreName` `collection` etc and is then built from 
there with things like `type=errors`



##########
solr/core/src/java/org/apache/solr/metrics/otel/OtelDoubleMetric.java:
##########
@@ -14,9 +14,9 @@
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
+package org.apache.solr.metrics.otel;
 
-/**
- * The {@link 
org.apache.solr.metrics.prometheus.jvm.SolrPrometheusJvmFormatter} is 
responsible for
- * exporting solr.jvm registry metrics to Prometheus.
- */
-package org.apache.solr.metrics.prometheus.jvm;
+@FunctionalInterface
+public interface OtelDoubleMetric {

Review Comment:
   Not really actually. I just removed it.



##########
solr/core/src/java/org/apache/solr/metrics/otel/instruments/AttributedLongTimer.java:
##########
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.metrics.otel.instruments;
+
+import io.opentelemetry.api.common.Attributes;
+import io.opentelemetry.api.metrics.LongHistogram;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * This class records the elapsed time (in milliseconds) to a {@link 
LongHistogram}. Each thread
+ * measures its own timer allowing concurrent timing operations on separate 
threads..
+ */
+public class AttributedLongTimer extends AttributedLongHistogram {
+
+  private final ThreadLocal<Long> startTimeNanos = new ThreadLocal<>();

Review Comment:
   I implemented a `MetricTimer` that extends the `RTimer`. This might work a 
bit nicer and works similar to how people might have used the Dropwizard Timer



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to