Hi, I have a Kafka Streams application deployed on 5 nodes, and under full traffic I am getting this error message:
org.apache.kafka.common.errors.InvalidProducerEpochException: Producer attempted to produce with an old epoch. Written offsets would not be recorded and no more records would be sent since the producer is fenced, indicating the task may be migrated out

I have 5 x 24 CPU/48 core machines with 128 GB of RAM. These machines are the Kafka brokers, with 2x1TB disks for the Kafka logs, and they also run the Kafka Streams application. The topic has a replication factor of 2 and is producing about 250k per second. I have 2 aggregations in the topology, writing to 2 output topics; the final output topics are in the tens of thousands per second range.

I'm assuming I have a bottleneck somewhere. I increased the broker thread counts and observed that the frequency of this error dropped, but it's still happening.

Here's the broker configuration I'm using now, though I might be overshooting some of these values:

num.network.threads=48
num.io.threads=48
socket.send.buffer.bytes=512000
socket.receive.buffer.bytes=512000
replica.socket.receive.buffer.bytes=1024000
socket.request.max.bytes=10485760
num.replica.fetchers=48
log.cleaner.threads=48
queued.max.requests=48000

I can't find good documentation on the effect of broker configuration on performance.