Hi everyone,

We recently switched to Kafka 1.0 and are facing an issue that we did not see with version 0.10.x.

One of our consumer groups falls into a permanent rebalancing cycle. On analysing the log files we noticed a StackOverflowError in kafka-coordinator-heartbeat-thread (see the partial stack trace below; overall it is over 1,000 lines long). Immediately before the error there are hundreds, if not thousands, of log entries of the following type:

2017-12-12 16:23:12.361 [kafka-coordinator-heartbeat-thread | my-consumer-group] INFO - [Consumer clientId=consumer-4, groupId=my-consumer-group] Marking the coordinator <IP>:<Port> (id: 2147483645 rack: null) dead

The stack traces always end somewhere in the DateFormat code, although at different lines.

Is this purely Kafka-internal (not to say a bug), or can we influence whether the error occurs, e.g. is there some configuration that potentially affects it? (Avoiding the connectivity issue to the group coordinator is the obvious thing to do, but what I have in mind is keeping the heartbeat thread alive despite the connectivity issue.)

Thanks in advance & regards,
Joerg

2017-12-12 16:23:05.884 [kafka-coordinator-heartbeat-thread | my-consumer-group] ERROR - Uncaught exception in thread 'kafka-coordinator-heartbeat-thread | my-consumer-group':
java.lang.StackOverflowError
        at java.text.DateFormatSymbols.getProviderInstance(DateFormatSymbols.java:362)
        at java.text.DateFormatSymbols.getInstance(DateFormatSymbols.java:340)
        at java.util.Calendar.getDisplayName(Calendar.java:2110)
        at java.text.SimpleDateFormat.subFormat(SimpleDateFormat.java:1125)
        at java.text.SimpleDateFormat.format(SimpleDateFormat.java:966)
        at java.text.SimpleDateFormat.format(SimpleDateFormat.java:936)
        at java.text.DateFormat.format(DateFormat.java:345)
        at org.apache.log4j.helpers.PatternParser$DatePatternConverter.convert(PatternParser.java:443)
        at org.apache.log4j.helpers.PatternConverter.format(PatternConverter.java:65)
        at org.apache.log4j.PatternLayout.format(PatternLayout.java:506)
        at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:310)
        at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
        at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
        at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
        at org.apache.log4j.Category.callAppenders(Category.java:206)
        at org.apache.log4j.Category.forcedLog(Category.java:391)
        at org.apache.log4j.Category.log(Category.java:856)
        at org.slf4j.impl.Log4jLoggerAdapter.info(Log4jLoggerAdapter.java:324)
        at org.apache.kafka.common.utils.LogContext$KafkaLogger.info(LogContext.java:341)
        at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead(AbstractCoordinator.java:649)
        at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onFailure(AbstractCoordinator.java:797)
        at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onFailure(RequestFuture.java:209)
        at org.apache.kafka.clients.consumer.internals.RequestFuture.fireFailure(RequestFuture.java:177)
        at org.apache.kafka.clients.consumer.internals.RequestFuture.raise(RequestFuture.java:147)
        at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:496)
...
The following 9 lines are repeated around a hundred times.
...
        at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:496)
        at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:353)
        at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.failUnsentRequests(ConsumerNetworkClient.java:416)
        at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.disconnect(ConsumerNetworkClient.java:388)
        at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead(AbstractCoordinator.java:653)
        at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onFailure(AbstractCoordinator.java:797)
        at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onFailure(RequestFuture.java:209)
        at org.apache.kafka.clients.consumer.internals.RequestFuture.fireFailure(RequestFuture.java:177)
        at org.apache.kafka.clients.consumer.internals.RequestFuture.raise(RequestFuture.java:147)
        at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:496)
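
My reading of the repeated frames, which may well be wrong, is that each failed request's handler calls coordinatorDead, which disconnects and fails the remaining unsent requests, whose handlers then call coordinatorDead again from inside the same call stack. To make that concrete, here is a simplified Java model of such a cycle. It is not Kafka's code: the class, its fields and the guardReentry switch are hypothetical names I made up to mirror the frames, and the guard is only an illustration of how the nesting could be bounded, not a known fix.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified model of the call cycle seen in the trace; this is NOT Kafka's code.
// All names are hypothetical stand-ins chosen to mirror the stack frames.
final class HeartbeatRecursionSketch {
    private final Deque<Runnable> pendingCompletions = new ArrayDeque<>();
    private final boolean guardReentry;
    private boolean draining = false;
    private int depth = 0;
    int maxDepth = 0;

    HeartbeatRecursionSketch(boolean guardReentry) {
        this.guardReentry = guardReentry;
    }

    // Queue n failed requests whose failure handler marks the coordinator dead.
    void queueFailedRequests(int n) {
        for (int i = 0; i < n; i++) {
            pendingCompletions.add(this::coordinatorDead);
        }
    }

    // Stands in for coordinatorDead -> disconnect -> failUnsentRequests.
    void coordinatorDead() {
        depth++;
        maxDepth = Math.max(maxDepth, depth);
        firePendingCompletedRequests();
        depth--;
    }

    // Stands in for firePendingCompletedRequests -> fireCompletion -> onFailure.
    void firePendingCompletedRequests() {
        if (guardReentry && draining) {
            return; // the outermost loop will run the remaining handlers
        }
        draining = true;
        try {
            Runnable handler;
            while ((handler = pendingCompletions.poll()) != null) {
                handler.run(); // re-enters coordinatorDead on the same stack
            }
        } finally {
            draining = false;
        }
    }

    // Deepest nesting of coordinatorDead calls when n failures are queued.
    static int maxDepthFor(int n, boolean guardReentry) {
        HeartbeatRecursionSketch sketch = new HeartbeatRecursionSketch(guardReentry);
        sketch.queueFailedRequests(n);
        sketch.coordinatorDead();
        return sketch.maxDepth;
    }

    public static void main(String[] args) {
        System.out.println("unguarded: " + maxDepthFor(100, false));
        System.out.println("guarded:   " + maxDepthFor(100, true));
    }
}
```

In this model the unguarded nesting depth grows by one for every queued failure (101 for 100 queued requests), while the guarded variant stays at a depth of 2 regardless of the queue length. If the real client behaves similarly, the overflow would depend on how many requests are queued when the coordinator becomes unreachable, and the DateFormat frames at the top would simply be where the stack happened to run out.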
