Chris Egerton created KAFKA-10876:
-------------------------------------

             Summary: Duplicate connector/task create requests lead to incorrect FAILED status
                 Key: KAFKA-10876
                 URL: https://issues.apache.org/jira/browse/KAFKA-10876
             Project: Kafka
          Issue Type: Bug
          Components: KafkaConnect
            Reporter: Chris Egerton


If a Connect worker tries to start a connector or task that it is already 
running, an error will be logged and the connector/task will be marked as 
{{FAILED}}. This logic is implemented in several places:
 * [https://github.com/apache/kafka/blob/300909d9e60eb1d5e80f4d744d3662a105ac0c15/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L257-L262]
 * [https://github.com/apache/kafka/blob/300909d9e60eb1d5e80f4d744d3662a105ac0c15/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L299-L306]
 * [https://github.com/apache/kafka/blob/300909d9e60eb1d5e80f4d744d3662a105ac0c15/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L511-L512]
 * [https://github.com/apache/kafka/blob/300909d9e60eb1d5e80f4d744d3662a105ac0c15/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L570-L572]

 

Although it's certainly abnormal for a worker to run into this case and an 
{{ERROR}}-level log message is warranted when it occurs, the connector/task 
should not be marked as {{FAILED}}, as there is still an instance of that 
connector/task running on the worker.

 

Either the worker logic should be updated to avoid marking connectors/tasks as 
{{FAILED}} in this case, or it should manually halt the existing connector/task 
before creating a new instance in its place. The first option is easier and 
more intuitive. However, if the already-running connector/task instance can 
ever have an outdated configuration while the to-be-created connector/task has 
an up-to-date configuration, only the second option would behave correctly.
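
The two options can be contrasted with a rough sketch. This is not the actual {{Worker}} code: the class {{DuplicateStartSketch}}, the {{Policy}} enum, and the use of a plain string as a stand-in for a connector/task configuration are all hypothetical, chosen only to illustrate the difference in behavior on a duplicate start request.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a worker; not Kafka Connect's Worker class.
class DuplicateStartSketch {
    enum Status { RUNNING, FAILED }

    // The two options discussed in the ticket (names are assumptions).
    enum Policy { KEEP_EXISTING, REPLACE_EXISTING }

    // name -> configuration of the instance currently running on this worker
    private final Map<String, String> running = new HashMap<>();
    // name -> last recorded status
    private final Map<String, Status> statuses = new HashMap<>();

    // Returns true if a new instance was started by this call.
    synchronized boolean start(String name, String config, Policy policy) {
        if (!running.containsKey(name)) {
            running.put(name, config);
            statuses.put(name, Status.RUNNING);
            return true;
        }

        // Abnormal case: still worth an ERROR-level message.
        System.err.println("ERROR: " + name + " is already running on this worker");

        if (policy == Policy.REPLACE_EXISTING) {
            // Option 2: halt the existing instance (here it is simply dropped)
            // and start a new one with the incoming configuration.
            running.remove(name);
            running.put(name, config);
            statuses.put(name, Status.RUNNING);
            return true;
        }

        // Option 1: leave the running instance and its status untouched.
        // Note that FAILED is never written here.
        return false;
    }

    synchronized Status status(String name) {
        return statuses.get(name);
    }

    synchronized String config(String name) {
        return running.get(name);
    }
}
```

With {{KEEP_EXISTING}}, a duplicate request leaves the status as {{RUNNING}} but also keeps the old configuration; with {{REPLACE_EXISTING}}, the incoming configuration wins, which is the case where the second option matters.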



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
