[ https://issues.apache.org/jira/browse/FLINK-21364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17283981#comment-17283981 ]
Thomas Weise commented on FLINK-21364:
--------------------------------------

I wonder if finished splits should be communicated separately to the enumerator? Theoretically splits could finish and the reader not (yet) request new splits.

Regarding the event time alignment: For the file source the split boundary may provide sufficient granularity. But for sources like Kafka and Kinesis where readers work on their splits "forever", this won't be the case. There would need to be a different mechanism to synchronize. In the old Kinesis source that is accomplished by exchanging the actual watermark information.

> piggyback finishedSplitIds in RequestSplitEvent
> -----------------------------------------------
>
>                 Key: FLINK-21364
>                 URL: https://issues.apache.org/jira/browse/FLINK-21364
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / Common
>    Affects Versions: 1.12.1
>            Reporter: Steven Zhen Wu
>            Priority: Major
>              Labels: pull-request-available
>
> For some split assignment strategy, the enumerator/assigner needs to track
> the completed splits to advance watermark for event time alignment or rough
> ordering. Right now, `RequestSplitEvent` for FLIP-27 source doesn't support
> pass-along of the `finishedSplitIds` info and hence we have to create our own
> custom source event type for Iceberg source.
> Here is the proposal of add such optional info to `RequestSplitEvent`.
> {code}
> public RequestSplitEvent(
>     @Nullable String hostName,
>     @Nullable Collection<String> finishedSplitIds)
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
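
For illustration only, here is a minimal, dependency-free sketch of how the proposed constructor and the enumerator-side bookkeeping might fit together. Everything in it is an assumption: `SplitRequest` is a hypothetical stand-in that mirrors the proposed `RequestSplitEvent(hostName, finishedSplitIds)` signature, and `FinishedSplitTracker` is a hypothetical helper, not part of Flink. It also shows why the comment above matters: the enumerator only learns about finished splits when a request actually arrives.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class FinishedSplitSketch {

    /** Hypothetical stand-in for the proposed event; NOT Flink's actual class. */
    static final class SplitRequest {
        private final String hostName; // may be null, as in the proposal
        private final List<String> finishedSplitIds;

        SplitRequest(String hostName, Collection<String> finishedSplitIds) {
            this.hostName = hostName;
            // Treat a null collection as "nothing finished" and copy defensively.
            this.finishedSplitIds = finishedSplitIds == null
                    ? Collections.<String>emptyList()
                    : Collections.unmodifiableList(new ArrayList<>(finishedSplitIds));
        }

        String hostName() {
            return hostName;
        }

        List<String> finishedSplitIds() {
            return finishedSplitIds;
        }
    }

    /** Hypothetical enumerator-side bookkeeping of assigned vs. finished splits. */
    static final class FinishedSplitTracker {
        private final Set<String> pending = new HashSet<>();
        private final Set<String> finished = new HashSet<>();

        void onAssigned(String splitId) {
            pending.add(splitId);
        }

        /** Called when a split request arrives; ids that were never assigned are ignored. */
        void onRequest(SplitRequest request) {
            for (String id : request.finishedSplitIds()) {
                if (pending.remove(id)) {
                    finished.add(id);
                }
            }
        }

        int finishedCount() {
            return finished.size();
        }

        /** True once every assigned split has been reported as finished. */
        boolean allFinished() {
            return pending.isEmpty();
        }
    }

    public static void main(String[] args) {
        FinishedSplitTracker tracker = new FinishedSplitTracker();
        for (String id : Arrays.asList("split-1", "split-2", "split-3")) {
            tracker.onAssigned(id);
        }

        // The reader finished two splits and asks for more work.
        tracker.onRequest(new SplitRequest("host-a", Arrays.asList("split-1", "split-2")));
        System.out.println("finished=" + tracker.finishedCount()
                + " allFinished=" + tracker.allFinished());

        // Until the reader sends another request, split-3 stays pending here
        // even if the reader has already finished it.
        tracker.onRequest(new SplitRequest(null, Collections.singletonList("split-3")));
        System.out.println("finished=" + tracker.finishedCount()
                + " allFinished=" + tracker.allFinished());
    }
}
```

In this sketch the tracker only advances when `onRequest` is called, so a reader that finishes a split without asking for a new one leaves the enumerator's view stale. That is the gap a separate "split finished" notification would close.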