[ 
https://issues.apache.org/jira/browse/FLINK-39218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Cranmer reassigned FLINK-39218:
-------------------------------------

    Assignee: Aleksandr Savonin

> Add CLI tool to manage lingering Kafka transactions
> ---------------------------------------------------
>
>                 Key: FLINK-39218
>                 URL: https://issues.apache.org/jira/browse/FLINK-39218
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / Kafka
>            Reporter: Aleksandr Savonin
>            Assignee: Aleksandr Savonin
>            Priority: Major
>
> When a Flink job using the KafkaSink with the EXACTLY_ONCE delivery 
> guarantee stops unexpectedly (e.g., crashes), it can leave Kafka 
> transactions in an ONGOING state. These lingering transactions block 
> downstream consumers reading at the read_committed isolation level, because 
> the Last Stable Offset (LSO) cannot advance past them.
> Currently, there is no built-in way to resolve this other than restarting 
> the original Flink job or waiting for the transaction timeout 
> (transaction.timeout.ms) to expire.
> This ticket adds a dedicated CLI tool 
> (flink-connector-kafka-transaction-tool) packaged as a self-contained 
> uber-jar that allows operators to manually resolve stuck transactions:
>  * Abort: Connects with the same transactional.id to fence the previous 
> producer, forcing the broker to abort the open transaction.
>  * Commit: Resumes a specific transaction using the exact producerId and 
> epoch, taken from Flink checkpoint state or logs, and commits it.
> The tool is implemented as a new Maven module of the same name.
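
The abort path and its effect on the LSO, as described in the ticket, can be sketched with a toy in-memory model. This is not Kafka or Flink code and not the tool's implementation: the class FencingSketch, its methods, and the transactional.id "flink-sink-0" are hypothetical names chosen for illustration, and abort/commit control markers are ignored. In the real client, the re-initialization step corresponds to calling KafkaProducer#initTransactions() on a producer configured with the same transactional.id.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of broker-side transaction state; hypothetical, NOT Kafka code. */
class FencingSketch {

    /** State tracked per transactional.id. */
    static final class TxnState {
        short epoch = -1;        // bumped on every producer initialization
        boolean ongoing = false; // true while a transaction is open
        long firstOffset = -1L;  // first offset written by the open transaction
    }

    private final Map<String, TxnState> states = new HashMap<>();
    private long highWatermark = 0L;

    /** Models producer initialization: bump the epoch and abort any open transaction. */
    short initProducer(String transactionalId) {
        TxnState s = states.computeIfAbsent(transactionalId, id -> new TxnState());
        s.epoch++;          // older producers with the same id are now fenced
        s.ongoing = false;  // the lingering transaction is aborted
        return s.epoch;
    }

    /** Models a transactional write; a stale epoch is rejected as fenced. */
    void write(String transactionalId, short epoch, int records) {
        TxnState s = states.get(transactionalId);
        if (s == null || s.epoch != epoch) {
            throw new IllegalStateException("fenced: stale producer epoch " + epoch);
        }
        if (!s.ongoing) {
            s.ongoing = true;
            s.firstOffset = highWatermark;
        }
        highWatermark += records;
    }

    /** LSO: the high watermark, capped by the first offset of any open transaction. */
    long lastStableOffset() {
        long lso = highWatermark;
        for (TxnState s : states.values()) {
            if (s.ongoing) {
                lso = Math.min(lso, s.firstOffset);
            }
        }
        return lso;
    }

    public static void main(String[] args) {
        FencingSketch broker = new FencingSketch();
        short oldEpoch = broker.initProducer("flink-sink-0");
        broker.write("flink-sink-0", oldEpoch, 5); // job crashes; transaction stays open
        System.out.println("LSO while transaction lingers: " + broker.lastStableOffset());

        broker.initProducer("flink-sink-0");       // same id: fences and aborts
        System.out.println("LSO after abort: " + broker.lastStableOffset());
    }
}
```

In this model the LSO stays pinned at the first offset of the open transaction until a producer with the same transactional.id is initialized again. After that, the LSO catches up to the high watermark and any write from the old epoch is rejected.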



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
