C0urante opened a new pull request #10907:
URL: https://github.com/apache/kafka/pull/10907


   Implements 
[KIP-618](https://cwiki.apache.org/confluence/display/KAFKA/KIP-618%3A+Exactly-Once+Support+for+Source+Connectors#KIP618:ExactlyOnceSupportforSourceConnectors-Newmetrics).
   
   There are several changes here that can be reviewed fairly independently of 
each other:
   - Support for transactional source tasks, which is largely implemented in 
the `ExactlyOnceWorkerSourceTask` class, its newly-introduced 
`AbstractWorkerSourceTask` superclass, the `Worker` class (whose API for task 
starts has been split up from the existing `startTask` method into separate 
`startSinkTask`, `startSourceTask`, and `startExactlyOnceSourceTask` methods), 
and the `WorkerTransactionContext` class (which is used to allow connectors to 
define their own transaction boundaries)
   - Zombie fencing logic and the use of a transactional producer for some 
writes to the config topic, which are done by the leader of the cluster and are 
largely implemented in the `DistributedHerder`, `ConfigBackingStore`, 
`KafkaConfigBackingStore`, and `ClusterConfigState` classes
   - A new method in the `Admin` API for fencing out transactional producers by 
ID, which is done with changes to the `Admin` interface (unsurprisingly) and 
the `KafkaAdminClient` class
   - Support for per-connector offsets topics, which touches on the `Worker`, 
`OffsetStorageReaderImpl`, and `OffsetStorageWriter` classes
   - A few new `SourceConnector` methods for communicating support for 
exactly-once guarantees and connector-defined transactions; these take place in 
the `SourceConnector` class (also unsurprisingly) and the `AbstractHerder` class
   
   Existing unit tests are expanded where applicable, and new ones have been 
introduced where necessary.
   
   Eight new integration tests are added, which cover scenarios including 
preflight validation checks, all three types of transaction boundary, graceful 
recovery of the leader when fenced out from the config topic by another worker, 
ensuring that the correct number of task producers are fenced out across 
generations, accurate reporting of failure to bring up tasks when fencing does 
not succeed (includes an ACL-secured embedded Kafka cluster to simulate one of 
the most likely potential causes of this issue--insufficient permissions on the 
targeted Kafka cluster), and the use of a custom offsets topic.
   
   Many but not all existing system tests are modified to add cases involving 
exactly-once source support, which helps give us reasonable confidence that the 
feature is agnostic with regards to rebalance protocol. A new test is added 
that is based on the existing bounce test, but with no sink connector and with 
stricter expectations for delivery guarantees (no duplicates are permitted).
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to