[ 
https://issues.apache.org/jira/browse/KAFKA-19341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17954530#comment-17954530
 ] 

Patrik Kleindl commented on KAFKA-19341:
----------------------------------------

Update: This issue was fixed on trunk by a minor PR 
(https://github.com/apache/kafka/pull/18674), and the fix is part of 4.0.0.

I added a unit test locally to verify this and noticed that the current tests 
don't cover the values used in the code and would fail with them:
{code:java}
@Test
public void testRecordLimitWithLongerHighestTrackableValue() {
    long highestTrackableValue = Duration.ofMinutes(1).toMillis();
    HdrHistogram hdrHistogram = new HdrHistogram(10L, highestTrackableValue, 3);

    hdrHistogram.record(highestTrackableValue + 1);
    assertEquals(highestTrackableValue, hdrHistogram.max(System.currentTimeMillis()));
}{code}
This test fails until numberOfSignificantValueDigits (the last constructor 
parameter) is increased from 3 to 5.
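
The test above expects a value beyond highestTrackableValue to be capped at the limit rather than rejected. Below is a minimal sketch of that capping idea next to a strict variant that throws, as in the reported failure. BoundedHistogramSketch and its methods are hypothetical stand-ins, not the org.HdrHistogram API or the Kafka wrapper class, and the sketch tracks exact values, so it does not model the bucket precision controlled by numberOfSignificantValueDigits.

```java
import java.time.Duration;

/** Hypothetical stand-in for a bounded histogram; NOT the real HdrHistogram API. */
final class BoundedHistogramSketch {
    private final long highestTrackableValue;
    private long max = 0L;
    private long count = 0L;

    BoundedHistogramSketch(long highestTrackableValue) {
        if (highestTrackableValue < 1) {
            throw new IllegalArgumentException("highestTrackableValue must be >= 1");
        }
        this.highestTrackableValue = highestTrackableValue;
    }

    /** Strict recording: rejects values outside [0, highestTrackableValue]. */
    void recordStrict(long value) {
        if (value < 0 || value > highestTrackableValue) {
            throw new ArrayIndexOutOfBoundsException(
                "value " + value + " outside of histogram covered range");
        }
        max = Math.max(max, value);
        count++;
    }

    /** Clamped recording: caps the value into the covered range first. */
    void recordClamped(long value) {
        long capped = Math.min(Math.max(value, 0L), highestTrackableValue);
        recordStrict(capped);
    }

    long max() {
        return max;
    }

    long count() {
        return count;
    }
}

final class ClampSketchDemo {
    public static void main(String[] args) {
        long limit = Duration.ofMinutes(1).toMillis();
        BoundedHistogramSketch histogram = new BoundedHistogramSketch(limit);

        // The out-of-range value is capped at the limit instead of throwing.
        histogram.recordClamped(limit + 1);
        System.out.println("max after clamped record: " + histogram.max());

        // The strict variant throws for the same input.
        try {
            histogram.recordStrict(limit + 1);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("strict record failed: " + e.getMessage());
        }
    }
}
```

With capping, a slow event is still counted and reported at the limit, and a metrics failure cannot surface as an error in the calling operation.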

The related code that sets up the histogram is:
{code:java}
public static KafkaMetricHistogram newLatencyHistogram(
    Function<String, MetricName> metricNameFactory
) {
    return new KafkaMetricHistogram(
        metricNameFactory,
        MAX_LATENCY_MS,
        NUM_SIG_FIGS);
}{code}
[~jeffkbkim] Linking you here since you did the implementation and the fix for 
the exception.
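
For scale, the snippet below converts the out-of-range value from the exception message in the quoted issue (45050145) into a duration. It assumes the recorded value is in milliseconds, which the MAX_LATENCY_MS name suggests but which is not confirmed here. PurgatoryTimeScale and asDuration are hypothetical names used only for illustration.

```java
import java.time.Duration;

final class PurgatoryTimeScale {
    /** Interprets a recorded value as milliseconds (an assumption, see above). */
    static Duration asDuration(long recordedMillis) {
        return Duration.ofMillis(recordedMillis);
    }

    public static void main(String[] args) {
        Duration d = asDuration(45_050_145L);
        System.out.println(d);                  // PT12H30M50.145S
        System.out.println(d.toHours() + " h"); // 12 h
    }
}
```

Under that assumption the recorded time is roughly 12.5 hours.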

> Execution of HighWatermarkUpdate failed
> ---------------------------------------
>
>                 Key: KAFKA-19341
>                 URL: https://issues.apache.org/jira/browse/KAFKA-19341
>             Project: Kafka
>          Issue Type: Bug
>          Components: group-coordinator
>    Affects Versions: 4.0.0
>            Reporter: Patrik Kleindl
>            Priority: Major
>
> We got the following Exception multiple times in our logs when a client 
> showed problems with the group coordinator:
> {code:java}
> [ERROR] 2025-05-27 02:18:51,623 [group-coordinator-event-processor-0] org.apache.kafka.coordinator.group.runtime.CoordinatorRuntime complete - [GroupCoordinator id=2] Execution of HighWatermarkUpdate failed due to value 45050145 outside of histogram covered range. Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 16734 out of bounds for length 7168.
> java.lang.ArrayIndexOutOfBoundsException: value 45050145 outside of histogram covered range. Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 16734 out of bounds for length 7168
>     at org.HdrHistogram.AbstractHistogram.handleRecordException(AbstractHistogram.java:571)
>     at org.HdrHistogram.AbstractHistogram.recordSingleValue(AbstractHistogram.java:563)
>     at org.HdrHistogram.AbstractHistogram.recordValue(AbstractHistogram.java:467)
>     at org.HdrHistogram.Recorder.recordValue(Recorder.java:136)
>     at org.apache.kafka.coordinator.group.metrics.HdrHistogram.record(HdrHistogram.java:98)
>     at org.apache.kafka.coordinator.group.metrics.KafkaMetricHistogram.record(KafkaMetricHistogram.java:128)
>     at org.apache.kafka.common.metrics.Sensor.recordInternal(Sensor.java:237)
>     at org.apache.kafka.common.metrics.Sensor.record(Sensor.java:198)
>     at org.apache.kafka.coordinator.group.metrics.GroupCoordinatorRuntimeMetrics.recordEventPurgatoryTime(GroupCoordinatorRuntimeMetrics.java:301)
>     at org.apache.kafka.coordinator.group.runtime.CoordinatorRuntime$CoordinatorWriteEvent.complete(CoordinatorRuntime.java:1362)
>     at org.apache.kafka.deferred.DeferredEventQueue.completeUpTo(DeferredEventQueue.java:63)
>     at org.apache.kafka.coordinator.group.runtime.CoordinatorRuntime$HighWatermarkListener.lambda$onHighWatermarkUpdated$0(CoordinatorRuntime.java:1802)
>     at org.apache.kafka.coordinator.group.runtime.CoordinatorRuntime$CoordinatorInternalEvent.run(CoordinatorRuntime.java:1723)
>     at org.apache.kafka.coordinator.group.runtime.MultiThreadedEventProcessor$EventProcessorThread.handleEvents(MultiThreadedEventProcessor.java:148)
>     at org.apache.kafka.coordinator.group.runtime.MultiThreadedEventProcessor$EventProcessorThread.run(MultiThreadedEventProcessor.java:180){code}
> We are running Confluent Platform 7.9, which should be based on Apache Kafka 
> 3.9, but this exception should only be present in Kafka 4.0 (from 
> https://issues.apache.org/jira/browse/KAFKA-16379).
> I will create a ticket with Confluent, but since this code is part of Apache 
> Kafka itself, it could affect others too.
> If I understand the exception correctly, the HighWatermarkUpdate operation 
> itself was successful and the problem is caused by recording the metrics.
> The problem was resolved after a restart of the cluster and the client. It 
> had not shown up right after the last update or other changes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
