Hi, in documentation for SplitEnumeratorContext::callAsync method we read that:
"(...) When this method is invoked multiple times, The Callables may be executed in a thread pool concurrently. It is important to make sure that the callable does not modify any shared state, especially the states that will be a part of the SplitEnumerator.snapshotState(long) (...)" The ContinuousHiveSplitEnumerator::start method exectutes below code: enumeratorContext.callAsync( monitor, this::handleNewSplits, discoveryInterval, discoveryInterval); The handleNewSplits method does this: this.seenPartitionsSinceOffset = newSplitsAndState.seenPartitions; My question is, on how many threads "monitor" callable will be executed? If more than one (and this is possible accordingly to the callAsync documentation) then I think that this reference assignment in handleNewSplits method could lead to bugs. Since both threads that were executing monitor callable can return totally different collection. For ContinuousFileSplitEnumerator this was resolved in a different way. Filtering of already processed paths is done by Source-Coordinator thread and there is no reference assignment. Regards, Krzysztof Chmielewski