Dear Flink devs,

I'd like to start a discussion around an improvement of the exactly-once
Kafka sink. Due to its current design, the sink puts a large strain on
the main memory of the Kafka broker [1], which hinders adoption of the
KafkaSink. Since the KafkaSink will be the only way to produce into
Kafka in Flink 2.0, this limitation may in turn also limit Flink 2.0
adoption.

I appreciate feedback not only on technical aspects but also on the
requirements of the new approach [2]:
- Are most users/customers already on Kafka 3.0 brokers or newer?
- The new approach requires READ permission on the target topic (for
the DescribeProducers API) and DESCRIBE permission on the
transactional id (for the ListTransactions API). Are there any
concerns around this?
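For illustration, the two ACLs above could be granted with Kafka's stock
CLI roughly as follows (a sketch only; principal, broker address, topic,
and transactional-id prefix are placeholders, not names from the FLIP):

```shell
# READ on the target topic (needed for the DescribeProducers API);
# "User:flink", "broker:9092", and "my-target-topic" are placeholders.
kafka-acls.sh --bootstrap-server broker:9092 \
  --add --allow-principal User:flink \
  --operation Read --topic my-target-topic

# DESCRIBE on the transactional ids (needed for the ListTransactions API);
# a prefixed pattern covers all ids derived from one transactional-id prefix.
kafka-acls.sh --bootstrap-server broker:9092 \
  --add --allow-principal User:flink \
  --operation Describe --transactional-id my-app-prefix \
  --resource-pattern-type prefixed
```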

The new behavior would be opt-in in the first phase, while the current
behavior remains the default. If the community accepts the new
behavior, we can later enable it by default and provide an option to
opt out.

Note that this improvement is borderline FLIP-worthy, as it only
touches the public interfaces of the connector. However, by going
through an official FLIP, I'm confident that we can gather feedback
more quickly.

I'm also looking for volunteers to test the implementation on their
dev clusters [3].

Best,

Arvid

[1] https://issues.apache.org/jira/browse/FLINK-34554
[2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-511%3A+Support+transaction+id+pooling+in+Kafka+connector
[3] https://github.com/apache/flink-connector-kafka/pull/154