[ https://issues.apache.org/jira/browse/KAFKA-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16020100#comment-16020100 ]
Randall Hauch edited comment on KAFKA-3821 at 5/22/17 8:13 PM:
---------------------------------------------------------------

The problem with the connector directly using {{OffsetStorageWriter}} is that it cannot guarantee order relative to the source records that Kafka Connect is already processing. In my use cases, the offset/partition should be updated as part of the sequence of normal source records, and that order must be maintained. The simplest example is a connector that wants to record that it is still making progress in its source but, for whatever reason, is not producing any source records.

But imagine a case where the connector has just recorded an offset via {{OffsetStorageWriter}} and then immediately produces a new {{SourceRecord}} with a new offset. This order is important, and it's really bad if the offset of the {{SourceRecord}} gets written before the offset from the connector's direct call, because the older offset would then overwrite the newer one. Of course, the opposite case is bad, too: imagine the connector producing a {{SourceRecord}} that is enqueued and not immediately processed, while the connector progresses a bit and wants to record its new offset. If it did the latter by explicitly writing to the {{OffsetStorageWriter}}, that write might happen before the offset in the {{SourceRecord}} is captured.

Bottom line is that connectors need to be able to specify the order of {{SourceRecord}} objects and offset updates, and that likely means they all need to be sent through the same poll mechanism.

was (Author: rhauch):
The problem with the connector directly using {{OffsetStorageWriter}} is that it cannot guarantee order relative to the source records that Kafka Connect is already processing. In my cases, the offset/partition should be updated as part of the sequence of normal source records, and that order must be maintained. The best and simplest example is a connector that still wants to record that it is still making progress in its source, but for whatever reason is not producing any source records.
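
To make the ordering concern in the comment concrete, here is a minimal, self-contained sketch. It does not use any Kafka Connect classes: {{Emitted}}, {{sideChannel()}}, and {{singleQueue()}} are hypothetical names invented for illustration, and the map and queue are only stand-ins for offset storage and for records waiting to be processed. It assumes the framework stores each record's offset when it drains the queue.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical simulation; none of these types are Kafka Connect classes.
public class OffsetOrderingSketch {

    /** Something the task emits: a record with an offset, or an offset-only update. */
    static final class Emitted {
        final String partition;
        final long offset;
        final String payload; // null means "offset update only, nothing to write to a topic"

        Emitted(String partition, long offset, String payload) {
            this.partition = partition;
            this.offset = offset;
            this.payload = payload;
        }
    }

    /** The task writes its newer offset directly, bypassing a record that is still queued. */
    public static long sideChannel() {
        Map<String, Long> store = new HashMap<>();
        Queue<Emitted> pending = new ArrayDeque<>();

        pending.add(new Emitted("db1", 10L, "row-a")); // enqueued, not yet processed
        store.put("db1", 11L);                         // direct write of newer progress

        // Later, the queued record is processed and its (older) offset is stored.
        Emitted e;
        while ((e = pending.poll()) != null) {
            store.put(e.partition, e.offset);
        }
        return store.get("db1"); // the stale offset has overwritten the newer one
    }

    /** Records and offset-only updates travel through the same queue, in task order. */
    public static long singleQueue() {
        Map<String, Long> store = new HashMap<>();
        Queue<Emitted> pending = new ArrayDeque<>();

        pending.add(new Emitted("db1", 10L, "row-a")); // normal record
        pending.add(new Emitted("db1", 11L, null));    // offset-only update, same channel

        Emitted e;
        while ((e = pending.poll()) != null) {
            // A record with a payload would also be written to a topic here.
            store.put(e.partition, e.offset);
        }
        return store.get("db1"); // offsets were applied in the order the task chose
    }

    public static void main(String[] args) {
        System.out.println("side channel: " + sideChannel()); // prints "side channel: 10"
        System.out.println("single queue: " + singleQueue()); // prints "single queue: 11"
    }
}
```

In this toy model the side-channel write ends with the stale offset 10 stored, while sending both kinds of update through one queue ends with 11. It illustrates only the ordering argument, not how Kafka Connect actually commits offsets.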

> Allow Kafka Connect source tasks to produce offset without writing to topics
> ----------------------------------------------------------------------------
>
> Key: KAFKA-3821
> URL: https://issues.apache.org/jira/browse/KAFKA-3821
> Project: Kafka
> Issue Type: Improvement
> Components: KafkaConnect
> Affects Versions: 0.9.0.1
> Reporter: Randall Hauch
> Labels: needs-kip
>
> Provide a way for a {{SourceTask}} implementation to record a new offset for
> a given partition without necessarily writing a source record to a topic.
> Consider a connector task that uses the same offset when producing an unknown
> number of {{SourceRecord}} objects (e.g., it is taking a snapshot of a
> database). Once the task completes those records, the connector wants to
> update the offsets (e.g., the snapshot is complete) but has no more records
> to be written to a topic. With this change, the task could simply supply an
> updated offset.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)