Hello Dinesh,

It seems you're still using version 1.0 of Kafka Streams with EOS enabled. Could you try upgrading to a newer version (2.6+) and see if this issue goes away?
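For reference, here is a minimal configuration sketch of what the upgrade path would look like. It assumes you can move the client to 2.6+ (with brokers on 2.5 or newer); the bootstrap servers and topology are placeholders, and the application id is just taken from your log output:

    import java.util.Properties;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;

    public class EosUpgradeSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Placeholders -- substitute your own application id and brokers.
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "bids_kafka_streams_beta_0001");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

            // On 2.6+ clients the KIP-447 exactly-once mode uses one producer per
            // stream thread instead of one per task, so producer fencing during
            // rebalances is far less disruptive. (On 3.0+ the constant is
            // StreamsConfig.EXACTLY_ONCE_V2 instead.)
            props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG,
                      StreamsConfig.EXACTLY_ONCE_BETA);

            StreamsBuilder builder = new StreamsBuilder();
            // ... build your topology here ...

            KafkaStreams streams = new KafkaStreams(builder.build(), props);
            streams.start();
            Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
        }
    }

Other than switching the processing.guarantee value, the rest of your configuration should carry over unchanged.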
On Mon, May 3, 2021 at 8:55 AM Dinesh Raj <dinu.i...@gmail.com> wrote:
> Hi,
>
> I am getting too many exceptions whenever the Kafka Streams application is
> scaled out or scaled down.
>
> > org.apache.kafka.common.errors.InvalidProducerEpochException: Producer
> > attempted to produce with an old epoch.
> > Written offsets would not be recorded and no more records would be sent
> > since the producer is fenced, indicating the task may be migrated out; it
> > means all tasks belonging to this thread should be migrated.
> >     at org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:206) ~[RTBKafkaStreams-1.0-SNAPSHOT-all.jar:?]
> >     at org.apache.kafka.streams.processor.internals.RecordCollectorImpl.lambda$send$0(RecordCollectorImpl.java:187) ~[RTBKafkaStreams-1.0-SNAPSHOT-all.jar:?]
> >     at org.apache.kafka.clients.producer.KafkaProducer$InterceptorCallback.onCompletion(KafkaProducer.java:1366) ~[RTBKafkaStreams-1.0-SNAPSHOT-all.jar:?]
> >     at org.apache.kafka.clients.producer.internals.ProducerBatch.completeFutureAndFireCallbacks(ProducerBatch.java:231) ~[RTBKafkaStreams-1.0-SNAPSHOT-all.jar:?]
> >     at org.apache.kafka.clients.producer.internals.ProducerBatch.done(ProducerBatch.java:197) ~[RTBKafkaStreams-1.0-SNAPSHOT-all.jar:?]
> >     at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:690) ~[RTBKafkaStreams-1.0-SNAPSHOT-all.jar:?]
> >     at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:676) ~[RTBKafkaStreams-1.0-SNAPSHOT-all.jar:?]
> >     at org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:634) ~[RTBKafkaStreams-1.0-SNAPSHOT-all.jar:?]
> >     at org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:568) ~[RTBKafkaStreams-1.0-SNAPSHOT-all.jar:?]
> >     at org.apache.kafka.clients.producer.internals.Sender.lambda$sendProduceRequest$0(Sender.java:757) ~[RTBKafkaStreams-1.0-SNAPSHOT-all.jar:?]
> >     at org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:109) ~[RTBKafkaStreams-1.0-SNAPSHOT-all.jar:?]
> >     at org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:584) ~[RTBKafkaStreams-1.0-SNAPSHOT-all.jar:?]
> >     at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:576) ~[RTBKafkaStreams-1.0-SNAPSHOT-all.jar:?]
> >     at org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:325) ~[RTBKafkaStreams-1.0-SNAPSHOT-all.jar:?]
> >     at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:240) ~[RTBKafkaStreams-1.0-SNAPSHOT-all.jar:?]
> >     at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_131]
> > Caused by: org.apache.kafka.common.errors.InvalidProducerEpochException:
> > Producer attempted to produce with an old epoch.
> >
> > 13:35:23.343 [bids_kafka_streams_beta_0001-a1ab0588-b1c2-4cd9-b02a-ccfed7594237-StreamThread-15]
> > ERROR org.apache.kafka.clients.consumer.internals.AbstractCoordinator -
> > [Consumer clientId=bids_kafka_streams_beta_0001-a1ab0588-b1c2-4cd9-b02a-ccfed7594237-StreamThread-15-consumer,
> > groupId=bids_kafka_streams_beta_0001] LeaveGroup request with
> > Generation{generationId=169,
> > memberId='bids_kafka_streams_beta_0001-a1ab0588-b1c2-4cd9-b02a-ccfed7594237-StreamThread-15-consumer-bc1dee7e-dad7-4eff-a251-6690dedb4493',
> > protocol='stream'} failed with error: The coordinator is not aware of this
> > member.
>
> I think this exception is not fatal, as the topology keeps running, but it
> is surely slowed down. And it never settles; these exceptions keep coming
> even after half an hour. My understanding is that once rebalancing is
> completed and tasks are migrated, I should stop getting this exception.
> Please help.
>
> Thanks & Regards,
> Dinesh Raj

--
-- Guozhang