Hi all, Could I get some input on the severity of this one that I found yesterday? If that’s a correctness issue, should it block this patch? Let me know under the ticket if there’s more info that I can provide to help.
https://issues.apache.org/jira/browse/SPARK-32136

Thanks,
Jason.

From: Jungtaek Lim <[email protected]>
Date: Wednesday, 1 July 2020 at 10:20 am
To: Shivaram Venkataraman <[email protected]>
Cc: Prashant Sharma <[email protected]>, 郑瑞峰 <[email protected]>, Gengliang Wang <[email protected]>, gurwls223 <[email protected]>, Dongjoon Hyun <[email protected]>, Jules Damji <[email protected]>, Holden Karau <[email protected]>, Reynold Xin <[email protected]>, Yuanjian Li <[email protected]>, "[email protected]" <[email protected]>, Takeshi Yamamuro <[email protected]>
Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release

SPARK-32130 [1] looks to be a performance regression introduced in Spark 3.0.0, which would be ideal to look into before releasing another bugfix version.

1. https://issues.apache.org/jira/browse/SPARK-32130

On Wed, Jul 1, 2020 at 7:05 AM Shivaram Venkataraman <[email protected]> wrote:

Hi all,

I just wanted to ping this thread to see if all the outstanding blockers for 3.0.1 have been fixed. If so, it would be great if we can get the release going. The CRAN team sent us a note that the version of SparkR available on CRAN for the current R version (4.0.2) is broken, and hence we need to update the package soon -- it would be great to do it with 3.0.1.

Thanks,
Shivaram

On Wed, Jun 24, 2020 at 8:31 PM Prashant Sharma <[email protected]> wrote:

+1 for 3.0.1 release. I too can help out as release manager.

On Thu, Jun 25, 2020 at 4:58 AM 郑瑞峰 <[email protected]> wrote:

I volunteer to be a release manager of 3.0.1, if nobody is working on this.
------------------ Original Message ------------------
From: "Gengliang Wang" <[email protected]>
Sent: Wednesday, June 24, 2020, 4:15 PM
To: "Hyukjin Kwon" <[email protected]>
Cc: "Dongjoon Hyun" <[email protected]>; "Jungtaek Lim" <[email protected]>; "Jules Damji" <[email protected]>; "Holden Karau" <[email protected]>; "Reynold Xin" <[email protected]>; "Shivaram Venkataraman" <[email protected]>; "Yuanjian Li" <[email protected]>; "Spark dev list" <[email protected]>; "Takeshi Yamamuro" <[email protected]>
Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release

+1, the issues mentioned are really serious.

On Tue, Jun 23, 2020 at 7:56 PM Hyukjin Kwon <[email protected]> wrote:

+1. Just as a note,
- SPARK-31918 <https://issues.apache.org/jira/browse/SPARK-31918> is fixed now, and there's no blocker.
- When we build SparkR, we should use the latest R version, at least 4.0.0.

On Wed, Jun 24, 2020 at 11:20 AM Dongjoon Hyun <[email protected]> wrote:

+1

Bests,
Dongjoon.

On Tue, Jun 23, 2020 at 1:19 PM Jungtaek Lim <[email protected]> wrote:

+1 on a 3.0.1 soon.

Probably it would be nice if some Scala experts could take a look at https://issues.apache.org/jira/browse/SPARK-32051 and include the fix in 3.0.1 if possible. It looks like APIs designed to work with Scala 2.11 & Java bring ambiguity in Scala 2.12 & Java.
On Wed, Jun 24, 2020 at 4:52 AM Jules Damji <[email protected]> wrote:

+1 (non-binding)

Sent from my iPhone
Pardon the dumb thumb typos :)

On Jun 23, 2020, at 11:36 AM, Holden Karau <[email protected]> wrote:

+1 on a patch release soon

On Tue, Jun 23, 2020 at 10:47 AM Reynold Xin <[email protected]> wrote:

+1 on doing a new patch release soon. I saw some of these issues when preparing the 3.0 release, and some of them are very serious.

On Tue, Jun 23, 2020 at 8:06 AM, Shivaram Venkataraman <[email protected]> wrote:

+1

Thanks Yuanjian -- I think it'll be great to have a 3.0.1 release soon.

Shivaram

On Tue, Jun 23, 2020 at 3:43 AM Takeshi Yamamuro <[email protected]> wrote:

Thanks for the heads-up, Yuanjian!
I also noticed branch-3.0 already has 39 commits after Spark 3.0.0. Wow, the updates are so quick.
Anyway, +1 for the release.

Bests,
Takeshi

On Tue, Jun 23, 2020 at 4:59 PM Yuanjian Li <[email protected]> wrote:

Hi dev-list,

I’m writing this to raise the discussion about Spark 3.0.1 feasibility since 4 blocker issues were found after Spark 3.0.0:

[SPARK-31990] The broken state store compatibility will cause a correctness issue when a streaming query with `dropDuplicates` uses a checkpoint written by the old Spark version.
[SPARK-32038] A regression bug in handling NaN values in COUNT(DISTINCT)
[SPARK-31918][WIP] CRAN requires SparkR to work with the latest R 4.0. This makes the 3.0 release unavailable on CRAN, and it only supports R [3.5, 4.0)
[SPARK-31967] Downgrade vis.js to fix Jobs UI loading time regression

I also noticed branch-3.0 already has 39 commits after Spark 3.0.0. I think it would be great if we have Spark 3.0.1 to deliver the critical fixes.

Any comments are appreciated.
Best,
Yuanjian

--
---
Takeshi Yamamuro

--
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
