[ https://issues.apache.org/jira/browse/FLINK-35419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17863748#comment-17863748 ]

Arvid Heise commented on FLINK-35419:
-------------------------------------

Looks like this could already be solved with 
[https://github.com/apache/flink-connector-kafka/pull/100/] 

> scan.bounded.latest-offset makes queries never finish if the latest message 
> is an EndTxn Kafka marker
> ----------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-35419
>                 URL: https://issues.apache.org/jira/browse/FLINK-35419
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Kafka
>    Affects Versions: 1.8.0, 1.16.0, 1.17.0, 1.19.0
>            Reporter: Fabian Paul
>            Priority: Major
>
> When running the kafka connector in bounded mode, the stop condition can be 
> defined as the latest offset when the job starts. Unfortunately, Kafka's 
> latest offset calculation also includes special marker records, such as 
> transaction markers, in the overall count.
>  
> When Flink waits for a job to finish, it compares the number of records read 
> up to that point against the original latest offset [1]. Since the consumer 
> will never see the special marker records, the latest offset is never 
> reached, and the job gets stuck.
>  
> To reproduce the issue, write into a Kafka topic and make sure that the 
> latest record is a transaction end marker. Afterwards, start a Flink job 
> configured with `scan.bounded.latest-offset` reading from that topic.
>  
> [1]https://github.com/confluentinc/flink/blob/59c5446c4aac0d332a21b456f4a3f82576104b80/flink-connectors/confluent-connector-kafka/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/reader/KafkaPartitionSplitReader.java#L128
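The failure mode described above can be illustrated with a small standalone simulation (hypothetical, not Flink code): offsets 0-4 hold data records, offset 5 holds a transaction end marker the consumer never delivers, and Kafka reports the latest offset as 6. A stop check based on the offsets of delivered records can never succeed, whereas a check based on the consumer position (which, in a real KafkaConsumer, advances past trailing control records) can.

```python
# Hypothetical simulation of a bounded Kafka read whose last offset is a
# transaction marker. Names and structure are illustrative only.
DATA_OFFSETS = [0, 1, 2, 3, 4]   # offsets the consumer actually delivers
MARKER_OFFSETS = {5}             # EndTxn control record, never delivered
END_OFFSET = 6                   # Kafka's "latest offset" counts the marker


def finishes_with_record_offsets() -> bool:
    """Naive stop check: compare the last delivered record's offset
    against the end offset. Stuck when a marker occupies the last slot."""
    last_seen = -1
    for off in DATA_OFFSETS:     # poll() only ever yields data records
        last_seen = off
    return last_seen + 1 >= END_OFFSET   # 5 >= 6 is never true


def finishes_with_consumer_position() -> bool:
    """Position-based stop check: the consumer's fetch position also
    moves past control records, so it can reach the end offset."""
    position = 0
    for off in DATA_OFFSETS:
        position = off + 1
    while position in MARKER_OFFSETS:    # skip trailing control markers
        position += 1
    return position >= END_OFFSET        # 6 >= 6
```

Under these assumptions, `finishes_with_record_offsets()` returns False (the job would hang forever) while `finishes_with_consumer_position()` returns True, which matches the direction of the fix referenced in the comment above.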



--
This message was sent by Atlassian Jira
(v8.20.10#820010)