Chris Egerton created KAFKA-12497:
-------------------------------------

             Summary: Source task offset commits continue even after task has 
failed
                 Key: KAFKA-12497
                 URL: https://issues.apache.org/jira/browse/KAFKA-12497
             Project: Kafka
          Issue Type: Bug
          Components: KafkaConnect
    Affects Versions: 3.0.0, 2.0.2, 2.1.2, 2.2.3, 2.3.2, 2.4.2, 2.5.2, 2.8.0, 
2.7.1, 2.6.2
            Reporter: Chris Egerton
            Assignee: Chris Egerton


Source task offset commits take place on a dedicated thread, which periodically 
triggers offset commits for all of the source tasks on the worker on a 
user-configurable interval and with a user-configurable timeout for each offset 
commit.

 

When a task fails, offset commits continue to take place. In the common case 
where there is no longer any chance for another successful offset commit for 
the task, this has two negative side-effects:

First, confusing log messages are emitted that some users reasonably interpret 
as a sign that the source task is still alive:
{noformat}
[2021-03-06 04:30:53,739] INFO WorkerSourceTask{id=Salesforce_PC_Connector_Agency-0} Committing offsets (org.apache.kafka.connect.runtime.WorkerSourceTask)
[2021-03-06 04:30:53,739] INFO WorkerSourceTask{id=Salesforce_PC_Connector_Agency-0} flushing 0 outstanding messages for offset commit (org.apache.kafka.connect.runtime.WorkerSourceTask)
{noformat}
Second, if the task has any source records pending, each commit attempt will 
block the shared offset commit thread until the offset commit timeout expires. 
This will take place repeatedly until either the task is restarted or deleted, 
or all of these records are flushed.
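
To get a feel for the cost of this second side effect, the arithmetic can be sketched as below. This is a rough illustration rather than Connect code: the class and method names are hypothetical, and it assumes that commits for different tasks run one at a time on the shared thread and that each stuck task consumes one full timeout per commit interval. The values 5000 ms and 60000 ms used in the note after the block are example settings for the worker's offset.flush.timeout.ms and offset.flush.interval.ms properties.

```java
// Hypothetical helper for illustration only; not part of Kafka Connect.
final class CommitThreadBlockingEstimate {
    private CommitThreadBlockingEstimate() {}

    /**
     * Time the shared commit thread spends blocked during one commit interval,
     * assuming every stuck task waits out the full timeout exactly once.
     */
    static long blockedMsPerInterval(int stuckTasks, long commitTimeoutMs) {
        if (stuckTasks < 0 || commitTimeoutMs < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        return Math.multiplyExact((long) stuckTasks, commitTimeoutMs);
    }

    /** True when stuck tasks alone use up the whole commit interval. */
    static boolean saturatesCommitThread(int stuckTasks, long commitTimeoutMs, long commitIntervalMs) {
        return blockedMsPerInterval(stuckTasks, commitTimeoutMs) >= commitIntervalMs;
    }
}
```

Under these assumptions, three stuck tasks with a 5000 ms timeout keep the thread blocked for 15000 ms of every interval, and twelve such tasks would use up an entire 60000 ms interval, leaving no time for healthy tasks to commit on schedule.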

 

In some other cases, it's actually somewhat sensible to continue to try to 
commit offsets. Even if a source task has died, data from it may still be in 
flight to the broker, and there's no reason not to commit the offsets for that 
data once it has been ack'd.

 

However, if there is no in-flight data from a source task that is pending an 
ack from the Kafka cluster, and the task has failed, there is no reason to 
continue to try to commit offsets. Additionally, if the producer has failed to 
send a record to Kafka with a non-retriable exception, there is also no reason 
to continue to try to commit offsets, as the current batch will never complete.
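
The two conditions above can be written down as a small decision function. This is only a sketch of the reasoning, not the actual WorkerSourceTask logic or a proposed patch: the class name, method name, and parameters are all hypothetical, and it assumes the framework can cheaply tell whether the task has failed, how many records are still awaiting an ack, and whether the producer has hit a non-retriable error.

```java
// Hypothetical policy sketch; none of these names exist in Kafka Connect.
final class OffsetCommitPolicy {
    private OffsetCommitPolicy() {}

    /**
     * @param taskFailed                 the source task has failed
     * @param inFlightRecords            records sent to the producer but not yet ack'd
     * @param producerFailedNonRetriably the producer hit a non-retriable send error
     * @return whether another offset commit attempt is worth making
     */
    static boolean shouldAttemptCommit(boolean taskFailed,
                                       int inFlightRecords,
                                       boolean producerFailedNonRetriably) {
        // The current batch can never complete, so a commit would only time out.
        if (producerFailedNonRetriably) {
            return false;
        }
        // A failed task with nothing awaiting an ack has nothing new to commit.
        if (taskFailed && inFlightRecords == 0) {
            return false;
        }
        // Healthy task, or failed task whose data may still be ack'd.
        return true;
    }
}
```

A failed task with records still in flight returns true here, which matches the case where committing offsets for data that is later ack'd remains sensible.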

 

We can address one or both of these cases to try to reduce the number of 
confusing logging messages, and if necessary, alter existing log messages to 
make it clear to the user that the task may not be alive.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
