Chris Egerton created KAFKA-9374:
------------------------------------

             Summary: Worker can be disabled by blocked connectors
                 Key: KAFKA-9374
                 URL: https://issues.apache.org/jira/browse/KAFKA-9374
             Project: Kafka
          Issue Type: Bug
          Components: KafkaConnect
    Affects Versions: 2.3.1, 2.4.0, 2.2.2, 2.2.1, 2.3.0, 2.1.1, 2.2.0, 2.1.0, 2.0.1, 2.0.0, 1.1.1, 1.1.0, 1.0.2, 1.0.1, 1.0.0
            Reporter: Chris Egerton
            Assignee: Chris Egerton

If a connector hangs during any of its {{initialize}}, {{start}}, {{stop}}, {{taskConfigs}}, {{taskClass}}, {{version}}, {{config}}, or {{validate}} methods, the worker becomes unable to serve some types of requests from then on, including connector creation, connector reconfiguration, and connector deletion.

This only occurs in distributed mode and is due to the threading model used by the [DistributedHerder|https://github.com/apache/kafka/blob/03f763df8a8d9482d8c099806336f00cf2521465/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/distributed/DistributedHerder.java] class.

One potential solution is to treat connectors that fail to start, stop, etc. in time the same way as tasks that fail to stop within the [task graceful shutdown timeout period|https://github.com/apache/kafka/blob/03f763df8a8d9482d8c099806336f00cf2521465/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerConfig.java#L121-L126]: handle all connector interactions on a separate thread, wait for them to complete within a timeout, and, if that timeout expires, abandon the thread and transition the connector to the {{FAILED}} state (if the connector has been created at all).
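
The proposed approach can be illustrated with a minimal sketch. This is not code from Kafka Connect: the {{ConnectorCallGuard}} class, its {{callWithTimeout}} method, and the local {{State}} enum are hypothetical names introduced only for this example. It assumes each connector method call can be wrapped in a {{Runnable}}, runs it on a separate daemon thread, waits up to a timeout, and reports {{FAILED}} if the call throws or does not return in time.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Kafka Connect.
public class ConnectorCallGuard {
    // Simplified stand-in for a connector status; an assumption for this sketch.
    public enum State { RUNNING, FAILED }

    // Daemon threads, so an abandoned call that never returns cannot keep the JVM alive.
    // A cached pool means one blocked call does not prevent later calls from running.
    private final ExecutorService executor = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "connector-call");
        t.setDaemon(true);
        return t;
    });

    // Runs the connector interaction off the caller's thread and waits at most timeoutMs.
    public State callWithTimeout(Runnable connectorCall, long timeoutMs) {
        Future<?> future = executor.submit(connectorCall);
        try {
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return State.RUNNING;
        } catch (TimeoutException e) {
            // Best-effort interrupt; if the connector ignores it, the thread is abandoned.
            future.cancel(true);
            return State.FAILED;
        } catch (ExecutionException e) {
            // The connector method threw an exception.
            return State.FAILED;
        } catch (InterruptedException e) {
            // The waiting thread itself was interrupted; restore the flag and give up.
            Thread.currentThread().interrupt();
            future.cancel(true);
            return State.FAILED;
        }
    }

    public static void main(String[] args) {
        ConnectorCallGuard guard = new ConnectorCallGuard();
        // A call that returns promptly.
        State ok = guard.callWithTimeout(() -> { }, 1000);
        // A call that blocks far longer than the timeout allows.
        State hung = guard.callWithTimeout(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException ignored) {
                // Interrupted by cancel(true); simply return.
            }
        }, 100);
        System.out.println(ok + " " + hung);
    }
}
```

The key property is that the calling thread (the herder thread, in the scenario described above) is blocked for at most the timeout, regardless of what the connector does. A connector that ignores interrupts would still leak a thread, which is the "abandoning the thread" trade-off mentioned in the proposal.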