Hi all,

Could I get some input on the severity of this issue, which I found yesterday? If it is a correctness issue, should it block this patch? Let me know under the ticket if there is more info I can provide to help.

https://issues.apache.org/jira/browse/SPARK-32136

Thanks,
Jason.

From: Jungtaek Lim <kabhwan.opensou...@gmail.com>
Date: Wednesday, 1 July 2020 at 10:20 am
To: Shivaram Venkataraman <shiva...@eecs.berkeley.edu>
Cc: Prashant Sharma <scrapco...@gmail.com>, 郑瑞峰 <ruife...@foxmail.com>, Gengliang Wang <gengliang.w...@databricks.com>, gurwls223 <gurwls...@gmail.com>, Dongjoon Hyun <dongjoon.h...@gmail.com>, Jules Damji <dmat...@comcast.net>, Holden Karau <hol...@pigscanfly.ca>, Reynold Xin <r...@databricks.com>, Yuanjian Li <xyliyuanj...@gmail.com>, "dev@spark.apache.org" <dev@spark.apache.org>, Takeshi Yamamuro <linguin....@gmail.com>
Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release

SPARK-32130 [1] looks to be a performance regression introduced in Spark 3.0.0, which would be ideal to look into before releasing another bugfix version.

1. https://issues.apache.org/jira/browse/SPARK-32130

On Wed, Jul 1, 2020 at 7:05 AM Shivaram Venkataraman <shiva...@eecs.berkeley.edu> wrote:

Hi all

I just wanted to ping this thread to see if all the outstanding blockers for 3.0.1 have been fixed. If so, it would be great if we can get the release going. The CRAN team sent us a note that the version of SparkR available on CRAN for the current R version (4.0.2) is broken and hence we need to update the package soon -- it will be great to do it with 3.0.1.

Thanks
Shivaram

On Wed, Jun 24, 2020 at 8:31 PM Prashant Sharma <scrapco...@gmail.com> wrote:

+1 for 3.0.1 release. I too can help out as release manager.

On Thu, Jun 25, 2020 at 4:58 AM 郑瑞峰 <ruife...@foxmail.com> wrote:

I volunteer to be a release manager of 3.0.1, if nobody is working on this.

------------------ Original Message ------------------
From: "Gengliang Wang" <gengliang.w...@databricks.com>
Sent: Wednesday, June 24, 2020, 4:15 PM
To: "Hyukjin Kwon" <gurwls...@gmail.com>
Cc: "Dongjoon Hyun" <dongjoon.h...@gmail.com>; "Jungtaek Lim" <kabhwan.opensou...@gmail.com>; "Jules Damji" <dmat...@comcast.net>; "Holden Karau" <hol...@pigscanfly.ca>; "Reynold Xin" <r...@databricks.com>; "Shivaram Venkataraman" <shiva...@eecs.berkeley.edu>; "Yuanjian Li" <xyliyuanj...@gmail.com>; "Spark dev list" <dev@spark.apache.org>; "Takeshi Yamamuro" <linguin....@gmail.com>
Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release

+1, the issues mentioned are really serious.

On Tue, Jun 23, 2020 at 7:56 PM Hyukjin Kwon <gurwls...@gmail.com> wrote:

+1.

Just as a note,
- SPARK-31918 <https://issues.apache.org/jira/browse/SPARK-31918> is fixed now, and there's no blocker.
- When we build SparkR, we should use the latest R version, at least 4.0.0.

On Wed, Jun 24, 2020 at 11:20 AM, Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:

+1

Bests,
Dongjoon.

On Tue, Jun 23, 2020 at 1:19 PM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:

+1 on a 3.0.1 soon.

It would be nice if some Scala experts could take a look at https://issues.apache.org/jira/browse/SPARK-32051 and include the fix in 3.0.1 if possible. It looks like APIs designed to work with Scala 2.11 & Java bring ambiguity in Scala 2.12 & Java.

On Wed, Jun 24, 2020 at 4:52 AM Jules Damji <dmat...@comcast.net> wrote:

+1 (non-binding)

Sent from my iPhone
Pardon the dumb thumb typos :)

On Jun 23, 2020, at 11:36 AM, Holden Karau <hol...@pigscanfly.ca> wrote:

+1 on a patch release soon

On Tue, Jun 23, 2020 at 10:47 AM Reynold Xin <r...@databricks.com> wrote:

+1 on doing a new patch release soon. I saw some of these issues when preparing the 3.0 release, and some of them are very serious.

On Tue, Jun 23, 2020 at 8:06 AM, Shivaram Venkataraman <shiva...@eecs.berkeley.edu> wrote:

+1

Thanks Yuanjian -- I think it'll be great to have a 3.0.1 release soon.

Shivaram

On Tue, Jun 23, 2020 at 3:43 AM Takeshi Yamamuro <linguin....@gmail.com> wrote:

Thanks for the heads-up, Yuanjian!

I also noticed branch-3.0 already has 39 commits after Spark 3.0.0.
Wow, the updates are so quick. Anyway, +1 for the release.

Bests,
Takeshi

On Tue, Jun 23, 2020 at 4:59 PM Yuanjian Li <xyliyuanj...@gmail.com> wrote:

Hi dev-list,

I'm writing to start a discussion about the feasibility of Spark 3.0.1, since 4 blocker issues were found after Spark 3.0.0:

- [SPARK-31990] Broken state store compatibility causes a correctness issue when a streaming query with `dropDuplicate` uses a checkpoint written by an older Spark version.
- [SPARK-32038] A regression bug in handling NaN values in COUNT(DISTINCT).
- [SPARK-31918][WIP] CRAN requires SparkR to work with the latest R 4.0. As a result, the 3.0 release is unavailable on CRAN and only supports R [3.5, 4.0).
- [SPARK-31967] Downgrade vis.js to fix the Jobs UI loading time regression.

I also noticed branch-3.0 already has 39 commits after Spark 3.0.0. I think it would be great to have Spark 3.0.1 to deliver the critical fixes.

Any comments are appreciated.

Best,
Yuanjian

--
---
Takeshi Yamamuro

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org

--
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau