AHeise commented on code in PR #100:
URL: https://github.com/apache/flink-connector-kafka/pull/100#discussion_r1760653304

##########
flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/reader/KafkaPartitionSplitReader.java:
##########
@@ -122,32 +122,32 @@ public RecordsWithSplitIds<ConsumerRecord<byte[], byte[]>> fetch() throws IOExce
         KafkaPartitionSplitRecords recordsBySplits =
                 new KafkaPartitionSplitRecords(consumerRecords, kafkaSourceReaderMetrics);
         List<TopicPartition> finishedPartitions = new ArrayList<>();
-        for (TopicPartition tp : consumerRecords.partitions()) {
+        for (TopicPartition tp : consumer.assignment()) {
             long stoppingOffset = getStoppingOffset(tp);
-            final List<ConsumerRecord<byte[], byte[]>> recordsFromPartition =
-                    consumerRecords.records(tp);
-
-            if (recordsFromPartition.size() > 0) {
-                final ConsumerRecord<byte[], byte[]> lastRecord =
-                        recordsFromPartition.get(recordsFromPartition.size() - 1);
-
-                // After processing a record with offset of "stoppingOffset - 1", the split reader
-                // should not continue fetching because the record with stoppingOffset may not
-                // exist. Keep polling will just block forever.
-                if (lastRecord.offset() >= stoppingOffset - 1) {
-                    recordsBySplits.setPartitionStoppingOffset(tp, stoppingOffset);
-                    finishSplitAtRecord(
-                            tp,
-                            stoppingOffset,
-                            lastRecord.offset(),
-                            finishedPartitions,
-                            recordsBySplits);
-                }
+            long consumerPosition = consumer.position(tp);
+            // Stop fetching when the consumer's position reaches the stoppingOffset.
+            // Control messages may follow the last record; therefore, using the last record's
+            // offset as a stopping condition could result in indefinite blocking.

Review Comment:
   I think I understand now:
   * If I use the stopping offset `endOffset` and it picks, let's say, 100, it would abort reading at 100 and set the stopping offset of the partition to 100.
   * Record 100 has been read and is part of the batch. It will not be filtered out.
   * So it indeed appears as the last record of the batch.

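To make the code comment in the diff concrete, here is a minimal, self-contained sketch that compares the removed last-record check with the added position check. It is not the connector's code: the class `StoppingConditionSketch`, both method names, and the offsets are hypothetical, and the Kafka consumer is replaced by plain values. It assumes that control messages (such as transaction markers) occupy an offset but are not returned to the application, while the consumer position still advances past them.

```java
import java.util.List;

public class StoppingConditionSketch {

    /**
     * Removed check from the diff: the split is finished once the last record
     * returned in the batch sits at "stoppingOffset - 1" or later.
     */
    public static boolean finishedByLastRecord(List<Long> batchOffsets, long stoppingOffset) {
        if (batchOffsets.isEmpty()) {
            // No records were returned, so this check can never fire.
            return false;
        }
        long lastOffset = batchOffsets.get(batchOffsets.size() - 1);
        return lastOffset >= stoppingOffset - 1;
    }

    /**
     * Added check from the diff: the split is finished once the consumer
     * position has reached the stopping offset.
     */
    public static boolean finishedByPosition(long consumerPosition, long stoppingOffset) {
        return consumerPosition >= stoppingOffset;
    }

    public static void main(String[] args) {
        // Hypothetical partition: data records at offsets 97 and 98 and a
        // control message at offset 99, so the end offset is 100.
        long stoppingOffset = 100L;

        // Assumption: the control message at offset 99 is not handed to the
        // application, so the batch ends at offset 98.
        List<Long> batchOffsets = List.of(97L, 98L);

        // Assumption: the position has moved past the control message.
        long consumerPosition = 100L;

        System.out.println(
                "last-record check finished: "
                        + finishedByLastRecord(batchOffsets, stoppingOffset));
        System.out.println(
                "position check finished: "
                        + finishedByPosition(consumerPosition, stoppingOffset));
    }
}
```

With these hypothetical offsets, the last-record check reports `false` (98 is below 99) and would keep polling, while the position check reports `true`. In the case described in the review comment, where the last record of the batch reaches the stopping offset, both checks agree.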