[ https://issues.apache.org/jira/browse/KAFKA-19341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17954530#comment-17954530 ]
Patrik Kleindl commented on KAFKA-19341:
----------------------------------------

Update: This issue was fixed by a minor PR in Trunk (https://github.com/apache/kafka/pull/18674) and is part of 4.0.0.

I added a unit test locally to verify this and noticed that the current tests don't cover the values used in the code and would fail with them
{code:java}
@Test
public void testRecordLimitWithLongerHighestTrackableValue() {
    long highestTrackableValue = Duration.ofMinutes(1).toMillis();
    HdrHistogram hdrHistogram = new HdrHistogram(10L, highestTrackableValue, 3);
    hdrHistogram.record(highestTrackableValue + 1);
    assertEquals(highestTrackableValue, hdrHistogram.max(System.currentTimeMillis()));
}{code}
This fails until numberOfSignificantValueDigits (last parameter) is increased from 3 to 5.

The related code where this is set up is
{code:java}
public static KafkaMetricHistogram newLatencyHistogram(
    Function<String, MetricName> metricNameFactory
) {
    return new KafkaMetricHistogram(
        metricNameFactory,
        MAX_LATENCY_MS,
        NUM_SIG_FIGS);
}{code}
[~jeffkbkim] Linking you here as you did the implementation and the fix for the exception.

> Execution of HighWatermarkUpdate failed
> ---------------------------------------
>
>                 Key: KAFKA-19341
>                 URL: https://issues.apache.org/jira/browse/KAFKA-19341
>             Project: Kafka
>          Issue Type: Bug
>          Components: group-coordinator
>    Affects Versions: 4.0.0
>            Reporter: Patrik Kleindl
>            Priority: Major
>
> We got the following Exception multiple times in our logs when a client
> showed problems with the group coordinator:
> {code:java}
> [ERROR] 2025-05-27 02:18:51,623 [group-coordinator-event-processor-0]
> org.apache.kafka.coordinator.group.runtime.CoordinatorRuntime complete -
> [GroupCoordinator id=2] Execution of HighWatermarkUpdate failed due to value
> 45050145 outside of histogram covered range. Caused by:
> java.lang.ArrayIndexOutOfBoundsException: Index 16734 out of bounds for
> length 7168.
> java.lang.ArrayIndexOutOfBoundsException: value 45050145 outside of histogram
> covered range. Caused by: java.lang.ArrayIndexOutOfBoundsException: Index
> 16734 out of bounds for length 7168
>     at org.HdrHistogram.AbstractHistogram.handleRecordException(AbstractHistogram.java:571)
>     at org.HdrHistogram.AbstractHistogram.recordSingleValue(AbstractHistogram.java:563)
>     at org.HdrHistogram.AbstractHistogram.recordValue(AbstractHistogram.java:467)
>     at org.HdrHistogram.Recorder.recordValue(Recorder.java:136)
>     at org.apache.kafka.coordinator.group.metrics.HdrHistogram.record(HdrHistogram.java:98)
>     at org.apache.kafka.coordinator.group.metrics.KafkaMetricHistogram.record(KafkaMetricHistogram.java:128)
>     at org.apache.kafka.common.metrics.Sensor.recordInternal(Sensor.java:237)
>     at org.apache.kafka.common.metrics.Sensor.record(Sensor.java:198)
>     at org.apache.kafka.coordinator.group.metrics.GroupCoordinatorRuntimeMetrics.recordEventPurgatoryTime(GroupCoordinatorRuntimeMetrics.java:301)
>     at org.apache.kafka.coordinator.group.runtime.CoordinatorRuntime$CoordinatorWriteEvent.complete(CoordinatorRuntime.java:1362)
>     at org.apache.kafka.deferred.DeferredEventQueue.completeUpTo(DeferredEventQueue.java:63)
>     at org.apache.kafka.coordinator.group.runtime.CoordinatorRuntime$HighWatermarkListener.lambda$onHighWatermarkUpdated$0(CoordinatorRuntime.java:1802)
>     at org.apache.kafka.coordinator.group.runtime.CoordinatorRuntime$CoordinatorInternalEvent.run(CoordinatorRuntime.java:1723)
>     at org.apache.kafka.coordinator.group.runtime.MultiThreadedEventProcessor$EventProcessorThread.handleEvents(MultiThreadedEventProcessor.java:148)
>     at org.apache.kafka.coordinator.group.runtime.MultiThreadedEventProcessor$EventProcessorThread.run(MultiThreadedEventProcessor.java:180){code}
> We are running Confluent Platform 7.9 which should be based on Apache Kafka
> 3.9, but this Exception should only be present in Kafka 4.0 from
> https://issues.apache.org/jira/browse/KAFKA-16379
> I will create a ticket with Confluent, but as this code is part of Apache
> Kafka itself it could probably affect others too.
> If I understand the exception correctly, the HighWatermarkUpdate operation
> itself was successful, but the problem is caused by writing the metrics.
> After a restart of the cluster and the client the problem was resolved, but
> it didn't show up right after the last update or changes.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)