[ https://issues.apache.org/jira/browse/KAFKA-17445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17878006#comment-17878006 ]
Bruno Cadonna commented on KAFKA-17445:
---------------------------------------

[~rohitbobade] It is not clear to me what you tried to achieve by setting {{group.instance.id}}. Could you please elaborate? Did you increase {{session.timeout.ms}} as described in the config definition (https://kafka.apache.org/documentation/#consumerconfigs_group.instance.id)? Could you describe the exact steps? Did you delete the consumer group on the broker between the attempts? Was this a new Streams app or an existing one?

> Kafka streams keeps rebalancing with the following reasons
> ----------------------------------------------------------
>
>                 Key: KAFKA-17445
>                 URL: https://issues.apache.org/jira/browse/KAFKA-17445
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 3.8.0
>            Reporter: Rohit Bobade
>            Priority: Major
>
> We recently upgraded Kafka streams version to 3.8.0 and are seeing that the
> streams app keeps rebalancing and does not process any events
> We have explicitly set the config
> GROUP_INSTANCE_ID_CONFIG
> This is what we see on the broker logs:
> [GroupCoordinator 2]: Preparing to rebalance group \{consumer-group-name} in
> state PreparingRebalance with old generation 24781 (__consumer_offsets-29)
> (reason: Updating metadata for static member {} with instance id {}; client
> reason: rebalance failed due to UnjoinedGroupException)
> We also tried to remove the GROUP_INSTANCE_ID_CONFIG but then see these logs
> and rebalancing and no processing still
> sessionTimeoutMs=45000, rebalanceTimeoutMs=1800000,
> supportedProtocols=List(stream)) has left group \{groupId} through explicit
> `LeaveGroup`; client reason: the consumer unsubscribed from all topics
> (kafka.coordinator.group.GroupCoordinator)
> other logs show:
> during Stable; client reason: need to revoke partitions and re-join)
> client reason: triggered followup rebalance scheduled for 0
> On the application logs we see:
> 1. state being restored from changelog topic
> 2. INFO org.apache.kafka.streams.processor.internals.StreamThread -
> stream-thread at state RUNNING: partitions lost due to missed rebalance.
> Detected that the thread is being fenced. This implies that this thread
> missed a rebalance and dropped out of the consumer group. Will close out all
> assigned tasks and rejoin the consumer group.
>
> 3. Task Migrated exceptions
> org.apache.kafka.streams.errors.TaskMigratedException: Error encountered
> sending record to topic
> org.apache.kafka.common.errors.InvalidProducerEpochException: Producer with
> transactionalId
> attempted to produce with an old epoch
> Written offsets would not be recorded and no more records would be sent since
> the producer is fenced, indicating the task may be migrated out; it means all
> tasks belonging to this thread should be migrated.
> at
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:306)
> ~[kafka-streams-3.8.0.jar:?]
> at
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.lambda$send$1(RecordCollectorImpl.java:286)
> ~[kafka-streams-3.8.0.jar:?]
> at
> datadog.trace.instrumentation.kafka_clients.KafkaProducerCallback.onCompletion(KafkaProducerCallback.java:44)
> ~[?:?]
> at
> org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:1106)
> ~[kafka-clients-3.8.0.jar:?]

--
This message was sent by Atlassian Jira
(v8.20.10#820010)