[ https://issues.apache.org/jira/browse/FLINK-33153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Martijn Visser updated FLINK-33153:
-----------------------------------
    Affects Version/s:     (was: kafka-4.1.0)

> Kafka using latest-offset maybe missing data
> --------------------------------------------
>
>                 Key: FLINK-33153
>                 URL: https://issues.apache.org/jira/browse/FLINK-33153
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Kafka
>            Reporter: tanjialiang
>            Priority: Minor
>
> When the Kafka source starts with the latest-offset strategy, it does not
> fetch the latest offset at startup and specify it for consumption. Instead,
> it sets startingOffset to -1 (KafkaPartitionSplit.LATEST_OFFSET), which
> makes currentOffset = -1, and calls the KafkaConsumer's seekToEnd API.
> currentOffset is only set to the consumed offset + 1 once the task consumes
> data, and this currentOffset is what is stored in state during
> checkpointing. If there are very few messages in Kafka and a partition has
> not consumed any data, and I stop the task with a savepoint, then write
> data to that partition, and start the task from the savepoint, the task
> resumes from the saved state. Because the startingOffset in the state is
> -1, the task misses the data that was written before the recovery point.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
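
The sequence described in the report can be reproduced with a small stand-alone simulation. This is only a sketch under stated assumptions: `Partition`, `Reader`, and `replay` are hypothetical names invented for illustration and are not classes from the Flink Kafka connector; only the sentinel value -1 is taken from the report. The second mode, which stores the resolved position instead of the sentinel, is included purely as a contrast and is not claimed to be the fix chosen upstream.

```java
import java.util.ArrayList;
import java.util.List;

final class LatestOffsetSketch {
    // Sentinel value the report cites for KafkaPartitionSplit.LATEST_OFFSET.
    static final long LATEST_OFFSET = -1L;

    /** Hypothetical stand-in for one Kafka partition: an append-only list. */
    static final class Partition {
        final List<String> records = new ArrayList<>();
        long endOffset() { return records.size(); }
    }

    /** Hypothetical stand-in for the reader of a single split. */
    static final class Reader {
        long currentOffset; // what would be written into checkpoint state
        long position;      // in-memory consumer position, lost on restart

        Reader(long startingOffset, Partition p) {
            currentOffset = startingOffset;
            // A seekToEnd-like step: the sentinel resolves to the log end *now*.
            position = (startingOffset == LATEST_OFFSET) ? p.endOffset() : startingOffset;
        }

        List<String> poll(Partition p) {
            List<String> out =
                new ArrayList<>(p.records.subList((int) position, p.records.size()));
            if (!out.isEmpty()) {
                position += out.size();
                currentOffset = position; // last consumed offset + 1
            }
            return out;
        }
    }

    /** Returns the records the restored reader sees after a stop/write/restart cycle. */
    static List<String> replay(boolean snapshotResolvedPosition) {
        Partition p = new Partition();
        Reader before = new Reader(LATEST_OFFSET, p);
        before.poll(p); // nothing to consume, so currentOffset stays -1

        // "Savepoint": either the raw state (-1) or the resolved position (0).
        long saved = snapshotResolvedPosition ? before.position : before.currentOffset;

        // Data written while the job is stopped.
        p.records.add("a");
        p.records.add("b");

        // Restore: a saved -1 resolves to the new log end, skipping "a" and "b".
        Reader after = new Reader(saved, p);
        return after.poll(p);
    }

    public static void main(String[] args) {
        System.out.println("sentinel in state: " + replay(false));
        System.out.println("resolved offset in state: " + replay(true));
    }
}
```

In this model the records written between the savepoint and the restart are skipped only when the state still holds the sentinel, which matches the behaviour the reporter describes.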