Hi, I'm running Kafka Streams v2.3.0. During peak hours, I noticed that some nodes throw a Timeout exception, after which the node's status is marked DEAD. I implemented a CustomProductionExceptionHandler that returns ProductionExceptionHandlerResponse.CONTINUE, but it doesn't solve the problem; it only keeps the nodes running.
Could anyone help me understand why the Timeout exception above is thrown? I searched a lot, and people say it's due to producer slowness, but GC and CPU usage are both low. I have set request.timeout to 5 min, default.api.timeout to 3 min, session.timeout to 2 min, retries to 1000, and retry backoff to 100 ms. I'd really appreciate any help! Thanks.
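For reference, here is a rough sketch of how those settings look as properties. It assumes the shorthand names above map to the standard client keys request.timeout.ms, default.api.timeout.ms, session.timeout.ms, retries, and retry.backoff.ms, with the durations converted to milliseconds. The class name TimeoutSettings is only for illustration, and my real setup may pass these to the Streams config differently.

```java
import java.util.Properties;

// Illustrative holder for the timeout and retry settings described above.
// The class and method names are hypothetical; only the property keys are
// standard Kafka client configuration names.
class TimeoutSettings {

    static Properties build() {
        Properties props = new Properties();
        // 5 minutes, expressed in milliseconds
        props.put("request.timeout.ms", String.valueOf(5 * 60 * 1000));
        // 3 minutes
        props.put("default.api.timeout.ms", String.valueOf(3 * 60 * 1000));
        // 2 minutes
        props.put("session.timeout.ms", String.valueOf(2 * 60 * 1000));
        // up to 1000 retries, waiting 100 ms between attempts
        props.put("retries", "1000");
        props.put("retry.backoff.ms", "100");
        return props;
    }

    public static void main(String[] args) {
        // Print each configured key and value (iteration order is not guaranteed).
        build().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

These would be merged into the same Properties object handed to the Kafka Streams application, alongside application.id and bootstrap.servers.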