Issues with Kafka-based Exactly Once Processing and taskmanager.numberOfTaskSlots

Rion Williams Mon, 09 Dec 2024 06:13:50 -0800

Hi all,

While trying to improve the performance of some Flink jobs running in
production, I recently experimented with the
taskmanager.numberOfTaskSlots configuration for a few of them and
noticed an issue.


It appears that when this configuration is set to a value greater than 1,
the Kafka-based producers (using exactly-once delivery) fail with the
following error, which seems to be directly related to the configuration
change itself:

org.apache.kafka.common.errors.TimeoutException: Timeout expired after 60000ms while awaiting InitProducerId
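
For reference, the change amounts to something like the following in the
Flink configuration file (flink-conf.yaml is assumed here, and the value 2
is only an example of a value greater than 1):

```yaml
# Example value only; the error appears with any value greater than 1
taskmanager.numberOfTaskSlots: 2
```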


I searched the Apache JIRA project for a similar documented issue, but
didn't find anything that pointed directly to it. Is this a potential
bug, or is it the expected behavior (i.e. numberOfTaskSlots must be 1 for
exactly-once Kafka producers to work)?

Any advice would be appreciated!

Thanks,

Rion
