Alex Sorokoumov created FLINK-31408:
---------------------------------------
Summary: Add EXACTLY_ONCE support to upsert-kafka
Key: FLINK-31408
URL: https://issues.apache.org/jira/browse/FLINK-31408
Project: Flink
Issue Type: New Feature
Components: Connectors / Kafka
Reporter: Alex Sorokoumov

The {{upsert-kafka}} connector should support optional {{EXACTLY_ONCE}} delivery semantics. The [upsert-kafka docs|https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/upsert-kafka/#consistency-guarantees] suggest that the connector tolerates the duplicate records produced under {{AT_LEAST_ONCE}}. However, there are at least two reasons to configure the connector with {{EXACTLY_ONCE}}.

First, there might be other, non-Flink consumers of the topic that would rather not receive duplicated records. Second, multiple {{upsert-kafka}} producers might cause keys to roll back to previous values. Consider a scenario with two producing jobs, A and B, writing to the same topic with {{AT_LEAST_ONCE}}, and a consuming job reading from that topic. Both producers write unique, monotonically increasing sequences to the same key: job A writes {{x=a1,a2,a3,a4,a5,...}} and job B writes {{x=b1,b2,b3,b4,b5,...}}. With this setup, the following sequence is possible:

# Job A produces x=a5.
# Job B produces x=b5.
# Job A produces the duplicate write x=a5.

The consuming job would observe {{x}} going to {{a5}}, then to {{b5}}, then back to {{a5}}. {{EXACTLY_ONCE}} would prevent this rollback.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
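To make the rollback concrete, below is a minimal, hypothetical simulation of the three-step sequence described in the issue. This is plain Python, not Flink code; the helper name {{apply_write}} and the in-memory "topic log" are invented for illustration only.

```python
# Hypothetical simulation of two at-least-once producers writing
# monotonically increasing values to the same key, where a retried
# (duplicate) write rolls the key back to a stale value.

def apply_write(log, state, key, value):
    """Append a record to the simulated topic log and update the
    consumer's materialized view (upsert semantics: last write wins)."""
    log.append((key, value))
    state[key] = value

log, state = [], {}
observed = []

# 1. Job A produces x=a5.
apply_write(log, state, "x", "a5")
observed.append(state["x"])

# 2. Job B produces x=b5.
apply_write(log, state, "x", "b5")
observed.append(state["x"])

# 3. Job A's at-least-once retry duplicates the write x=a5.
apply_write(log, state, "x", "a5")
observed.append(state["x"])

# The consumer sees the key roll back from b5 to a5.
print(observed)  # ['a5', 'b5', 'a5']
```

Under {{EXACTLY_ONCE}}, step 3 would never be delivered, so the consumer's final view of {{x}} would remain {{b5}}.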