This is why RoundRobinPartitioning shouldn't be used ...
On Sat, Mar 12, 2022 at 12:08 PM, Jason Xu < jasonxu.sp...@gmail.com > wrote:
>
> Hi Spark community,
>
> I reported a data correctness issue in
> https://issues.apache.org/jira/browse/SPARK-38388 ( https://issues.apache.org/jira/b
Hi Spark community,
I reported a data correctness issue in
https://issues.apache.org/jira/browse/SPARK-38388. In short,
non-deterministic data + Repartition + FetchFailure could result in
incorrect data. This is an issue we run into in production pipelines. I
have an example to reproduce the bug i