[ https://issues.apache.org/jira/browse/FLINK-31008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17687046#comment-17687046 ]
ming li commented on FLINK-31008:
---------------------------------

[~lzljs3620320] I have created a pull request. Please review it when you have time. Thanks.

> [Flink][Table Store] The Split allocation of the same bucket in ContinuousFileSplitEnumerator may be out of order
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-31008
>                 URL: https://issues.apache.org/jira/browse/FLINK-31008
>             Project: Flink
>          Issue Type: Bug
>          Components: Table Store
>            Reporter: ming li
>            Assignee: ming li
>            Priority: Major
>              Labels: pull-request-available
>
> There are two places in {{ContinuousFileSplitEnumerator}} that add
> {{FileStoreSourceSplit}} instances to {{bucketSplits}}: {{addSplitsBack}} and
> {{processDiscoveredSplits}}. {{processDiscoveredSplits}} continuously checks
> for new splits and appends them to the per-bucket queue, so at this point the
> splits in each queue are in order.
> {code:java}
> private void addSplits(Collection<FileStoreSourceSplit> splits) {
>     splits.forEach(this::addSplit);
> }
>
> private void addSplit(FileStoreSourceSplit split) {
>     // Appends the split to the tail of its bucket's queue.
>     bucketSplits
>             .computeIfAbsent(((DataSplit) split.split()).bucket(), i -> new LinkedList<>())
>             .add(split);
> }{code}
> However, when a task fails over, the splits that were already assigned to it
> are returned to the enumerator. These returned splits are also appended to the
> end of the queue, behind splits that were discovered later, so splits of the
> same bucket can be assigned out of order.
>
> I think the returned splits should be added to the head of the queue instead,
> to preserve the order of allocation.
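
For illustration, the head-insertion proposed in the description could look roughly like the sketch below. This is not the actual patch: `Split` is a hypothetical stand-in for `FileStoreSourceSplit`, its `sequence` field exists only to make the order visible, and `poll` and `demo` are made-up helpers. The one detail worth noting is that the returned splits are pushed to the head in reverse, so that they keep their own relative order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

class BucketSplitQueueSketch {

    /** Hypothetical stand-in for FileStoreSourceSplit: a bucket plus a sequence number. */
    static final class Split {
        final int bucket;
        final long sequence;

        Split(int bucket, long sequence) {
            this.bucket = bucket;
            this.sequence = sequence;
        }
    }

    private final Map<Integer, LinkedList<Split>> bucketSplits = new HashMap<>();

    /** Newly discovered splits go to the tail, as in the snippet from the issue. */
    void addSplits(Collection<Split> splits) {
        for (Split s : splits) {
            bucketSplits.computeIfAbsent(s.bucket, i -> new LinkedList<>()).addLast(s);
        }
    }

    /** Returned splits go to the head; iterating in reverse keeps their relative order. */
    void addSplitsBack(List<Split> returned) {
        for (int i = returned.size() - 1; i >= 0; i--) {
            Split s = returned.get(i);
            bucketSplits.computeIfAbsent(s.bucket, k -> new LinkedList<>()).addFirst(s);
        }
    }

    /** Hands out the next split of a bucket, or null if the bucket has none left. */
    Split poll(int bucket) {
        LinkedList<Split> queue = bucketSplits.get(bucket);
        return queue == null ? null : queue.poll();
    }

    /** Replays the failover scenario and returns the assignment order for bucket 0. */
    static List<Long> demo() {
        BucketSplitQueueSketch enumerator = new BucketSplitQueueSketch();
        enumerator.addSplits(Arrays.asList(new Split(0, 1), new Split(0, 2), new Split(0, 3)));

        // Splits 1 and 2 are assigned to a reader.
        List<Split> assigned = new ArrayList<>();
        assigned.add(enumerator.poll(0));
        assigned.add(enumerator.poll(0));

        // Split 4 is discovered, then the reader fails and its splits come back.
        enumerator.addSplits(Arrays.asList(new Split(0, 4)));
        enumerator.addSplitsBack(assigned);

        List<Long> order = new ArrayList<>();
        for (Split s = enumerator.poll(0); s != null; s = enumerator.poll(0)) {
            order.add(s.sequence);
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [1, 2, 3, 4]
    }
}
```

If `addSplitsBack` appended to the tail instead, the same scenario would hand out the splits as 3, 4, 1, 2, which is the disorder the issue describes.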