Interesting. Looks like disconnection resulted in the stack overflow. I think the following would fix the overflow:
https://pastebin.com/Pm5g5V2L

On Thu, Dec 14, 2017 at 7:40 AM, Jörg Heinicke <joerg.heini...@gmx.de> wrote:

> Hi everyone,
>
> We recently switched to Kafka 1.0 and are facing an issue which we had
> not noticed with version 0.10.x before.
>
> One of our consumer groups falls into a permanent rebalancing cycle. On
> analysing the log files we noticed a StackOverflowError in
> kafka-coordinator-heartbeat-thread (see partial stack trace below,
> overall it's over 1,000 lines). Immediately before the error there are
> hundreds, if not thousands, of log entries of the following type:
>
> 2017-12-12 16:23:12.361 [kafka-coordinator-heartbeat-thread | my-consumer-group] INFO - [Consumer clientId=consumer-4, groupId=my-consumer-group] Marking the coordinator <IP>:<Port> (id: 2147483645 rack: null) dead
>
> The stack traces always end somewhere in the DateFormat code, though
> at different lines.
>
> Is that a purely Kafka-internal thing (not to say a bug), or can we somehow
> influence the occurrence of the error, e.g. is some configuration potentially
> affecting it? (Avoiding the connectivity issue to the group coordinator is
> an obvious thing to do, but I rather have in mind keeping the heartbeat
> thread alive despite the connectivity issue.)
>
> Thanks in advance & regards,
> Joerg
>
> 2017-12-12 16:23:05.884 [kafka-coordinator-heartbeat-thread | my-consumer-group] ERROR - Uncaught exception in thread 'kafka-coordinator-heartbeat-thread | my-consumer-group':
> java.lang.StackOverflowError
>     at java.text.DateFormatSymbols.getProviderInstance(DateFormatSymbols.java:362)
>     at java.text.DateFormatSymbols.getInstance(DateFormatSymbols.java:340)
>     at java.util.Calendar.getDisplayName(Calendar.java:2110)
>     at java.text.SimpleDateFormat.subFormat(SimpleDateFormat.java:1125)
>     at java.text.SimpleDateFormat.format(SimpleDateFormat.java:966)
>     at java.text.SimpleDateFormat.format(SimpleDateFormat.java:936)
>     at java.text.DateFormat.format(DateFormat.java:345)
>     at org.apache.log4j.helpers.PatternParser$DatePatternConverter.convert(PatternParser.java:443)
>     at org.apache.log4j.helpers.PatternConverter.format(PatternConverter.java:65)
>     at org.apache.log4j.PatternLayout.format(PatternLayout.java:506)
>     at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:310)
>     at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
>     at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
>     at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
>     at org.apache.log4j.Category.callAppenders(Category.java:206)
>     at org.apache.log4j.Category.forcedLog(Category.java:391)
>     at org.apache.log4j.Category.log(Category.java:856)
>     at org.slf4j.impl.Log4jLoggerAdapter.info(Log4jLoggerAdapter.java:324)
>     at org.apache.kafka.common.utils.LogContext$KafkaLogger.info(LogContext.java:341)
>     at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead(AbstractCoordinator.java:649)
>     at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onFailure(AbstractCoordinator.java:797)
>     at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onFailure(RequestFuture.java:209)
>     at org.apache.kafka.clients.consumer.internals.RequestFuture.fireFailure(RequestFuture.java:177)
>     at org.apache.kafka.clients.consumer.internals.RequestFuture.raise(RequestFuture.java:147)
>     at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:496)
> ...
> the following 9 lines are repeated around a hundred times.
> ...
>     at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:496)
>     at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:353)
>     at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.failUnsentRequests(ConsumerNetworkClient.java:416)
>     at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.disconnect(ConsumerNetworkClient.java:388)
>     at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead(AbstractCoordinator.java:653)
>     at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onFailure(AbstractCoordinator.java:797)
>     at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onFailure(RequestFuture.java:209)
>     at org.apache.kafka.clients.consumer.internals.RequestFuture.fireFailure(RequestFuture.java:177)
>     at org.apache.kafka.clients.consumer.internals.RequestFuture.raise(RequestFuture.java:147)
>     at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:496)
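
To make the cycle in the quoted trace concrete: coordinatorDead() calls disconnect(), which fails the unsent requests, and each failure handler calls coordinatorDead() again while the coordinator is still set, so the stack seems to grow by one round per pending request. Below is a minimal, self-contained sketch of that pattern and of one common way to break it (clearing the coordinator before disconnecting). It is not the Kafka source and not necessarily what the pastebin patch does; the class CoordinatorDeadSketch, its fields, and maxDepthFor are hypothetical names chosen for illustration only.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of the re-entrant callback cycle; NOT the Kafka implementation.
class CoordinatorDeadSketch {
    // Failure handlers of requests that have not been sent yet (assumed model).
    private final Deque<Runnable> pendingFailures = new ArrayDeque<>();
    private final boolean clearBeforeDisconnect;
    private String coordinator = "broker-1"; // placeholder for a coordinator node
    private int depth = 0;
    int maxDepth = 0; // deepest nesting of coordinatorDead() observed

    CoordinatorDeadSketch(int pendingRequests, boolean clearBeforeDisconnect) {
        this.clearBeforeDisconnect = clearBeforeDisconnect;
        for (int i = 0; i < pendingRequests; i++) {
            // Each failed request reports the coordinator as dead again.
            pendingFailures.add(this::coordinatorDead);
        }
    }

    void coordinatorDead() {
        if (coordinator == null) {
            return; // already marked dead, nothing to do
        }
        depth++;
        maxDepth = Math.max(maxDepth, depth);
        if (clearBeforeDisconnect) {
            // Guarded order: nested calls from the handlers return immediately.
            coordinator = null;
            disconnect();
        } else {
            // Unguarded order: handlers re-enter while coordinator is still set.
            disconnect();
            coordinator = null;
        }
        depth--;
    }

    // Fails every pending request by running its failure handler.
    private void disconnect() {
        Runnable handler;
        while ((handler = pendingFailures.poll()) != null) {
            handler.run();
        }
    }

    static int maxDepthFor(int pendingRequests, boolean clearBeforeDisconnect) {
        CoordinatorDeadSketch sketch =
                new CoordinatorDeadSketch(pendingRequests, clearBeforeDisconnect);
        sketch.coordinatorDead();
        return sketch.maxDepth;
    }

    public static void main(String[] args) {
        System.out.println("unguarded depth: " + maxDepthFor(100, false));
        System.out.println("guarded depth:   " + maxDepthFor(100, true));
    }
}
```

In this model the unguarded variant nests once per pending request plus the initial call, while the guarded variant stays at depth 1 regardless of how many requests are queued. With enough pending requests and a logging call in every round, the unguarded shape would plausibly exhaust the thread stack in the way the trace shows.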