Alexander Ivanichev created KAFKA-6631:
------------------------------------------
Summary: Kafka Streams - Rebalancing exception in Kafka 1.0.0
Key: KAFKA-6631
URL: https://issues.apache.org/jira/browse/KAFKA-6631
Project: Kafka
Issue Type: Bug
Components: streams
Affects Versions: 1.0.0
Environment: Container Linux by CoreOS 1576.5.0
Reporter: Alexander Ivanichev
In Kafka Streams 1.0.0 we are seeing a strange rebalance error. Our Streams app
performs window-based aggregations. Sometimes on startup, when all stream workers
join, the app simply crashes. If we enable only one worker, it works
fine; sometimes two workers also work fine, but when a third joins, the app crashes
again. This looks like a critical issue with rebalancing.
{code:java}
2018-03-08T18:51:01.226243000Z org.apache.kafka.common.KafkaException:
Unexpected error from SyncGroup: The server experienced an unexpected error
when processing the request
2018-03-08T18:51:01.226557000Z at
org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:566)
2018-03-08T18:51:01.226860000Z at
org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:539)
2018-03-08T18:51:01.227328000Z at
org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:808)
2018-03-08T18:51:01.227630000Z at
org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:788)
2018-03-08T18:51:01.228152000Z at
org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:204)
2018-03-08T18:51:01.228449000Z at
org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:167)
2018-03-08T18:51:01.228897000Z at
org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:127)
2018-03-08T18:51:01.229196000Z at
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:506)
2018-03-08T18:51:01.229673000Z at
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:353)
2018-03-08T18:51:01.229971000Z at
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:268)
2018-03-08T18:51:01.230436000Z at
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:214)
2018-03-08T18:51:01.230749000Z at
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:174)
2018-03-08T18:51:01.231065000Z at
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:364)
2018-03-08T18:51:01.231584000Z at
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:316)
2018-03-08T18:51:01.231911000Z at
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:295)
2018-03-08T18:51:01.232190000Z at
org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1138)
2018-03-08T18:51:01.232643000Z at
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1103)
2018-03-08T18:51:01.233121000Z at
org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:851)
2018-03-08T18:51:01.233409000Z at
org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:808)
2018-03-08T18:51:01.233720000Z at
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:774)
2018-03-08T18:51:01.234196000Z at
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:744)
2018-03-08T18:51:01.234655000Z org.apache.kafka.common.KafkaException:
Unexpected error from SyncGroup: The server experienced an unexpected error
when processing the request
2018-03-08T18:51:01.234972000Z exception in thread, closing process
2018-03-08T18:51:01.235500000Z at
org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:566)
2018-03-08T18:51:01.235839000Z at
org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:539)
2018-03-08T18:51:01.236336000Z at
org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:808)
2018-03-08T18:51:01.236603000Z at
org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:788)
2018-03-08T18:51:01.236889000Z at
org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:204)
2018-03-08T18:51:01.237092000Z at
org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:167)
2018-03-08T18:51:01.237531000Z at
org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:127)
2018-03-08T18:51:01.237816000Z at
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:506)
2018-03-08T18:51:01.238097000Z at
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:353)
2018-03-08T18:51:01.238395000Z at
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:268)
2018-03-08T18:51:01.238698000Z at
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:214)
2018-03-08T18:51:01.239511000Z exception in thread, closing process
2018-03-08T18:51:01.239880000Z exception in thread, closing process
2018-03-08T18:51:01.240175000Z at
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:174)
2018-03-08T18:51:01.240443000Z at
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:364)
2018-03-08T18:51:01.240764000Z at
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:316)
2018-03-08T18:51:01.241083000Z at
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:295)
2018-03-08T18:51:01.241367000Z at
org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1138)
2018-03-08T18:51:01.241789000Z at
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1103)
2018-03-08T18:51:01.242075000Z at
org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:851)
2018-03-08T18:51:01.242351000Z at
org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:808)
2018-03-08T18:51:01.242641000Z at
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:774)
2018-03-08T18:51:01.243051000Z at
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:744)
{code}
Looking further on the brokers, I saw another exception:
{code:java}
Appending metadata message for group AnomalyKafkaStreams generation 12 failed
due to org.apache.kafka.common.errors.RecordTooLargeException, returning
UNKNOWN error code to the client (kafka.coordinator.group.GroupMetadataManager)
{code}
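One possible reading of the broker log, offered as a sketch rather than a confirmed diagnosis: the group metadata record that the coordinator appends carries data for every member, so it grows as workers join and may at some point exceed the broker's record size limit. The snippet below only illustrates that arithmetic. The class and method names are hypothetical, the per-member payload size is made up, and the limit of 1000012 bytes is an assumption (the default `message.max.bytes` for this release, which may have been changed on the cluster). Real records also carry overhead that is not modeled here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not Kafka code.
public class GroupMetadataSizeSketch {

    // Assumption: broker default for message.max.bytes in Kafka 1.0.0.
    static final int ASSUMED_MAX_MESSAGE_BYTES = 1_000_012;

    // Rough stand-in for the serialized group metadata record:
    // the sum of the per-member payload sizes, ignoring all overhead.
    static long estimateRecordBytes(List<byte[]> memberPayloads) {
        long total = 0;
        for (byte[] payload : memberPayloads) {
            total += payload.length;
        }
        return total;
    }

    // True when the estimated record is larger than the given limit.
    static boolean wouldBeRejected(List<byte[]> memberPayloads, int maxMessageBytes) {
        return estimateRecordBytes(memberPayloads) > maxMessageBytes;
    }

    public static void main(String[] args) {
        int perMemberBytes = 400_000; // made-up size, not measured from the report
        List<byte[]> members = new ArrayList<>();
        for (int workers = 1; workers <= 3; workers++) {
            members.add(new byte[perMemberBytes]);
            System.out.printf("workers=%d estimatedBytes=%d rejected=%b%n",
                    workers,
                    estimateRecordBytes(members),
                    wouldBeRejected(members, ASSUMED_MAX_MESSAGE_BYTES));
        }
    }
}
```

With these made-up numbers the estimate stays under the assumed limit for one and two workers and crosses it with the third, which would be consistent with the pattern described above. Whether that is what actually happens here would need to be checked against the real record size and the broker configuration.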