Chris Egerton created KAFKA-12497:
-------------------------------------
Summary: Source task offset commits continue even after task has
failed
Key: KAFKA-12497
URL: https://issues.apache.org/jira/browse/KAFKA-12497
Project: Kafka
Issue Type: Bug
Components: KafkaConnect
Affects Versions: 3.0.0, 2.0.2, 2.1.2, 2.2.3, 2.3.2, 2.4.2, 2.5.2, 2.8.0,
2.7.1, 2.6.2
Reporter: Chris Egerton
Assignee: Chris Egerton
Source task offset commits take place on a dedicated thread, which periodically
triggers offset commits for all of the source tasks on the worker on a
user-configurable interval (offset.flush.interval.ms) and with a
user-configurable timeout (offset.flush.timeout.ms) for each offset commit.
When a task fails, offset commits continue to take place. In the common case
where there is no longer any chance for another successful offset commit for
the task, this has two negative side-effects:
First, confusing log messages are emitted that some users reasonably interpret
as a sign that the source task is still alive:
{noformat}
[2021-03-06 04:30:53,739] INFO
WorkerSourceTask{id=Salesforce_PC_Connector_Agency-0} Committing offsets
(org.apache.kafka.connect.runtime.WorkerSourceTask)
[2021-03-06 04:30:53,739] INFO
WorkerSourceTask{id=Salesforce_PC_Connector_Agency-0} flushing 0 outstanding
messages for offset commit
(org.apache.kafka.connect.runtime.WorkerSourceTask){noformat}
Second, if the task has any source records pending, it will block the shared
offset commit thread until the offset commit timeout expires. This will take
place repeatedly until either the task is restarted or deleted, or all of
these records are flushed.
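
To illustrate the second side-effect, here is a minimal sketch of a shared commit pass. It is not the actual Connect runtime code: SketchTask, SharedCommitterSketch, and commitAll are hypothetical names, and a CountDownLatch stands in for "all pending records have been ack'd". Under those assumptions, a failed task whose records never flush holds up the whole pass for the full timeout before any other task gets its turn:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a source task; not the real WorkerSourceTask. */
final class SketchTask {
    final String id;
    // Reaches zero once every pending record has been acknowledged.
    final CountDownLatch flushed;

    SketchTask(String id, boolean hasPendingRecords) {
        this.id = id;
        this.flushed = new CountDownLatch(hasPendingRecords ? 1 : 0);
    }

    /** Waits up to timeoutMs for pending records to flush; returns true if the flush completed. */
    boolean commitOffsets(long timeoutMs) {
        try {
            return flushed.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}

final class SharedCommitterSketch {
    /** One pass of the shared commit thread: tasks are handled one after another. Returns elapsed ms. */
    static long commitAll(List<SketchTask> tasks, long timeoutMs) {
        long start = System.nanoTime();
        for (SketchTask task : tasks) {
            boolean ok = task.commitOffsets(timeoutMs);
            System.out.println(task.id + (ok ? " committed offsets" : " timed out waiting for flush"));
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // The failed task's latch is never counted down, so it always hits the timeout.
        List<SketchTask> tasks = Arrays.asList(
                new SketchTask("failed-0", true),
                new SketchTask("healthy-1", false));
        System.out.println("commit pass took about " + commitAll(tasks, 200) + " ms");
    }
}
```

In this sketch the healthy task is only reached after the failed task's timeout has elapsed, and the same delay recurs on every pass for as long as the failed task stays registered.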
In some other cases, it's actually somewhat sensible to continue to try to
commit offsets. Even if a source task has died, data from it may still be in
flight to the broker, and there's no reason not to commit the offsets for that
data once it has been ack'd.
However, if there is no in-flight data from a source task that is pending an
ack from the Kafka cluster, and the task has failed, there is no reason to
continue to try to commit offsets. Additionally, if the producer has failed to
send a record to Kafka with a non-retriable exception, there is also no reason
to continue to try to commit offsets, as the current batch will never complete.
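
The two conditions above could be expressed as a single check before each commit attempt. This is only a sketch of the reasoning, not a proposed patch: TaskSnapshot, OffsetCommitPolicy, and shouldAttemptCommit are hypothetical names that do not exist in Kafka Connect, and how the runtime would track in-flight records or a non-retriable producer failure is assumed rather than taken from the code base.

```java
/** Hypothetical snapshot of a source task's state; not a Kafka Connect class. */
final class TaskSnapshot {
    final boolean failed;                  // the task has failed
    final int inFlightRecords;             // records sent but not yet ack'd by the Kafka cluster
    final boolean producerFailedFatally;   // a send failed with a non-retriable exception

    TaskSnapshot(boolean failed, int inFlightRecords, boolean producerFailedFatally) {
        this.failed = failed;
        this.inFlightRecords = inFlightRecords;
        this.producerFailedFatally = producerFailedFatally;
    }
}

final class OffsetCommitPolicy {
    /** Returns false only in the two cases where another commit attempt cannot succeed. */
    static boolean shouldAttemptCommit(TaskSnapshot task) {
        if (task.producerFailedFatally) {
            // The current batch will never complete, so waiting on it is pointless.
            return false;
        }
        if (task.failed && task.inFlightRecords == 0) {
            // The task is dead and nothing is pending an ack.
            return false;
        }
        // Healthy task, or a failed task with data still in flight to the broker.
        return true;
    }

    public static void main(String[] args) {
        System.out.println(shouldAttemptCommit(new TaskSnapshot(true, 0, false)));
        System.out.println(shouldAttemptCommit(new TaskSnapshot(true, 3, false)));
    }
}
```

A failed task with records still in flight keeps committing in this sketch, which matches the case described earlier where offsets for ack'd data are still worth committing.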
We can address one or both of these cases to reduce the number of confusing
log messages and, if necessary, alter existing log messages to make it clear
to the user that the task may not be alive.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)