Chris Egerton created KAFKA-9374:
------------------------------------
Summary: Worker can be disabled by blocked connectors
Key: KAFKA-9374
URL: https://issues.apache.org/jira/browse/KAFKA-9374
Project: Kafka
Issue Type: Bug
Components: KafkaConnect
Affects Versions: 2.3.1, 2.4.0, 2.2.2, 2.2.1, 2.3.0, 2.1.1, 2.2.0, 2.1.0,
2.0.1, 2.0.0, 1.1.1, 1.1.0, 1.0.2, 1.0.1, 1.0.0
Reporter: Chris Egerton
Assignee: Chris Egerton
If a connector hangs during any of its {{initialize}}, {{start}}, {{stop}},
{{taskConfigs}}, {{taskClass}}, {{version}}, {{config}}, or {{validate}}
methods, the worker will be disabled for some types of requests thereafter,
including connector creation, connector reconfiguration, and connector deletion.
This only occurs in distributed mode and is due to the threading model used by
the
[DistributedHerder|https://github.com/apache/kafka/blob/03f763df8a8d9482d8c099806336f00cf2521465/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/distributed/DistributedHerder.java]
class.
One potential solution is to handle all connector interactions on a separate
thread and wait for each one to complete within a timeout. If the timeout
expires, the worker would abandon the thread and transition the connector to
the {{FAILED}} state (if the connector has been created at all). This is
similar to how tasks that fail to stop within the [task graceful shutdown
timeout
period|https://github.com/apache/kafka/blob/03f763df8a8d9482d8c099806336f00cf2521465/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerConfig.java#L121-L126]
are treated today.
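
As a rough illustration of that idea, here is a minimal sketch using only
java.util.concurrent. It is not taken from the Connect code base: the class
{{ConnectorCallGuard}}, its {{State}} enum, and the {{invoke}} method are
hypothetical names chosen for this example, and a real fix would have to fit
into the herder's existing threading model.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Kafka Connect.
public class ConnectorCallGuard {
    // Simplified stand-in for the connector states tracked by a worker.
    public enum State { RUNNING, FAILED }

    // Daemon threads, so an abandoned call that never returns cannot
    // keep the JVM alive on shutdown.
    private final ExecutorService executor = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "connector-call");
        t.setDaemon(true);
        return t;
    });

    // Runs one connector interaction (start, stop, taskConfigs, ...) off the
    // caller's thread and waits at most timeoutMs for it to finish.
    public State invoke(Runnable connectorCall, long timeoutMs) {
        Future<?> future = executor.submit(connectorCall);
        try {
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return State.RUNNING;
        } catch (TimeoutException e) {
            // Best-effort interrupt; if the connector ignores it, the thread
            // is simply abandoned and the caller moves on.
            future.cancel(true);
            return State.FAILED;
        } catch (ExecutionException e) {
            // The connector method threw instead of hanging.
            return State.FAILED;
        } catch (InterruptedException e) {
            // Preserve the caller's interrupt status.
            Thread.currentThread().interrupt();
            return State.FAILED;
        }
    }
}
```

With a guard like this, the calling thread is blocked for at most the timeout
per call, so a single hung connector would no longer prevent it from serving
later requests. The trade-off is that abandoned threads may leak if the
connector never returns.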
--
This message was sent by Atlassian Jira
(v8.3.4#803005)