Hi all,

Spark 2.3.1 was released only recently, but unfortunately we have since
discovered and fixed several critical issues.

*SPARK-24495: SortMergeJoin may produce wrong result.*
This is a serious correctness bug and is easy to hit: the query has a
duplicated join key from the left table, e.g.
`WHERE t1.a = t2.b AND t1.a = t2.c`, and the join is executed as a sort
merge join. This bug is only present in Spark 2.3.

*SPARK-24588: stream-stream join may produce wrong result*
This is a correctness bug in a new feature of Spark 2.3: the stream-stream
join. Users can hit this bug if one of the join sides is partitioned by a
subset of the join keys.

*SPARK-24552: Task attempt numbers are reused when stages are retried*
This is a long-standing bug in the output committer that may introduce data
corruption.

*SPARK-24542: UDFXPathXXXX allow users to pass carefully crafted XML to
access arbitrary files*
This is a potential security issue if users build an access control module
on top of Spark.

I think we need a Spark 2.3.2 release to address these issues (especially
the correctness bugs) ASAP. Any thoughts?

Thanks,
Wenchen
