Hello there,

I managed to fix this, but I would love to understand why it failed... hopefully some of you can explain :-)
The production configuration has 4 kafka-streams threads per instance, and there are about 4 instances, so about 16 kafka-streams threads are working in total. The production topic has 48 partitions.

I wanted to debug our application locally, and since the 4 threads made the output too noisy, I changed the configuration to 1 kafka-streams thread. I debug using a different application name (consumer group) while connecting to the production broker, so the local instance consumes all 48 partitions.

With one thread, the application starts and the KafkaStreams status goes to RUNNING after properly receiving the task assignment, etc. But after 30 seconds without doing anything, it starts throwing exceptions such as:

    Error sending fetch request (sessionId=INVALID, epoch=INITIAL) to node 1: org.apache.kafka.common.errors.DisconnectException

and

    org.apache.kafka.common.errors.TimeoutException: Timeout expired while fetching topic metadata

If I change the kafka-streams configuration back to 4 threads, the problem disappears, so I guess it is related to the single thread handling 48 partitions, not being able to keep up with the internal Kafka connection management, and being disconnected... but can someone explain why?

Thank you!

Best,
Javier Arias Losada
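
P.S. In case it helps, here is a rough sketch of the local setup described above. It uses plain string keys (application.id, bootstrap.servers, num.stream.threads are the standard Kafka Streams setting names) so that it runs without the Kafka libraries. The application id and broker address are placeholders, not our real values, and the partitions-per-thread figure assumes partitions are spread evenly over the threads of one consumer group.

```java
import java.util.Properties;

final class LocalDebugConfig {

    // Builds the Kafka Streams settings relevant to this question.
    // The values passed in below are hypothetical placeholders.
    static Properties build(String applicationId, String bootstrapServers, int streamThreads) {
        Properties props = new Properties();
        // A separate application.id means a separate consumer group,
        // so the local instance gets all partitions assigned to itself.
        props.setProperty("application.id", applicationId);
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("num.stream.threads", Integer.toString(streamThreads));
        return props;
    }

    // Upper bound on partitions handled by one thread, assuming an even spread.
    static int partitionsPerThread(int partitions, int threads) {
        return (partitions + threads - 1) / threads;
    }

    public static void main(String[] args) {
        // The setup that fails after about 30 seconds: one thread, 48 partitions.
        Properties failing = build("my-app-local-debug", "prod-broker:9092", 1);
        // The setup that works locally: four threads in the same local group.
        Properties working = build("my-app-local-debug", "prod-broker:9092", 4);

        System.out.println("failing threads: " + failing.getProperty("num.stream.threads"));
        System.out.println("working threads: " + working.getProperty("num.stream.threads"));

        System.out.println(partitionsPerThread(48, 1));   // prints 48 (local, 1 thread)
        System.out.println(partitionsPerThread(48, 4));   // prints 12 (local, 4 threads)
        System.out.println(partitionsPerThread(48, 16));  // prints 3  (production, 4 x 4 threads)
    }
}
```

So the failing case has a single thread fetching from 48 partitions, versus 12 per thread in the local setup that works and about 3 per thread in production.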