SPARK-32130 [1] looks to be a performance regression introduced in Spark 3.0.0, which would be good to look into before releasing another bugfix version.
1. https://issues.apache.org/jira/browse/SPARK-32130

On Wed, Jul 1, 2020 at 7:05 AM Shivaram Venkataraman <shiva...@eecs.berkeley.edu> wrote:

> Hi all
>
> I just wanted to ping this thread to see if all the outstanding blockers
> for 3.0.1 have been fixed. If so, it would be great if we can get the
> release going. The CRAN team sent us a note that the version SparkR
> available on CRAN for the current R version (4.0.2) is broken and hence we
> need to update the package soon -- it will be great to do it with 3.0.1.
>
> Thanks
> Shivaram
>
> On Wed, Jun 24, 2020 at 8:31 PM Prashant Sharma <scrapco...@gmail.com>
> wrote:
>
>> +1 for 3.0.1 release.
>> I too can help out as release manager.
>>
>> On Thu, Jun 25, 2020 at 4:58 AM 郑瑞峰 <ruife...@foxmail.com> wrote:
>>
>>> I volunteer to be a release manager of 3.0.1, if nobody is working on
>>> this.
>>>
>>>
>>> ------------------ Original Message ------------------
>>> *From:* "Gengliang Wang"<gengliang.w...@databricks.com>;
>>> *Sent:* Wednesday, June 24, 2020, 4:15 PM
>>> *To:* "Hyukjin Kwon"<gurwls...@gmail.com>;
>>> *Cc:* "Dongjoon Hyun"<dongjoon.h...@gmail.com>;
>>> "Jungtaek Lim"<kabhwan.opensou...@gmail.com>;
>>> "Jules Damji"<dmat...@comcast.net>;
>>> "Holden Karau"<hol...@pigscanfly.ca>;
>>> "Reynold Xin"<r...@databricks.com>;
>>> "Shivaram Venkataraman"<shiva...@eecs.berkeley.edu>;
>>> "Yuanjian Li"<xyliyuanj...@gmail.com>;
>>> "Spark dev list"<dev@spark.apache.org>;
>>> "Takeshi Yamamuro"<linguin....@gmail.com>;
>>> *Subject:* Re: [DISCUSS] Apache Spark 3.0.1 Release
>>>
>>> +1, the issues mentioned are really serious.
>>>
>>> On Tue, Jun 23, 2020 at 7:56 PM Hyukjin Kwon <gurwls...@gmail.com>
>>> wrote:
>>>
>>>> +1.
>>>>
>>>> Just as a note,
>>>> - SPARK-31918 <https://issues.apache.org/jira/browse/SPARK-31918> is
>>>> fixed now, and there's no blocker.
>>>> - When we build SparkR, we should use the latest R version at least
>>>> 4.0.0+.
>>>>
>>>> On Wed, Jun 24, 2020 at 11:20 AM, Dongjoon Hyun <dongjoon.h...@gmail.com>
>>>> wrote:
>>>>
>>>>> +1
>>>>>
>>>>> Bests,
>>>>> Dongjoon.
>>>>>
>>>>> On Tue, Jun 23, 2020 at 1:19 PM Jungtaek Lim
>>>>> <kabhwan.opensou...@gmail.com> wrote:
>>>>>
>>>>>> +1 on a 3.0.1 soon.
>>>>>>
>>>>>> Probably it would be nice if some Scala experts can take a look at
>>>>>> https://issues.apache.org/jira/browse/SPARK-32051 and include the
>>>>>> fix into 3.0.1 if possible.
>>>>>> Looks like APIs designed to work with Scala 2.11 & Java bring
>>>>>> ambiguity in Scala 2.12 & Java.
>>>>>>
>>>>>> On Wed, Jun 24, 2020 at 4:52 AM Jules Damji <dmat...@comcast.net>
>>>>>> wrote:
>>>>>>
>>>>>>> +1 (non-binding)
>>>>>>>
>>>>>>> Sent from my iPhone
>>>>>>> Pardon the dumb thumb typos :)
>>>>>>>
>>>>>>> On Jun 23, 2020, at 11:36 AM, Holden Karau <hol...@pigscanfly.ca>
>>>>>>> wrote:
>>>>>>>
>>>>>>> +1 on a patch release soon
>>>>>>>
>>>>>>> On Tue, Jun 23, 2020 at 10:47 AM Reynold Xin <r...@databricks.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> +1 on doing a new patch release soon. I saw some of these issues
>>>>>>>> when preparing the 3.0 release, and some of them are very serious.
>>>>>>>>
>>>>>>>> On Tue, Jun 23, 2020 at 8:06 AM, Shivaram Venkataraman
>>>>>>>> <shiva...@eecs.berkeley.edu> wrote:
>>>>>>>>
>>>>>>>>> +1 Thanks Yuanjian -- I think it'll be great to have a 3.0.1
>>>>>>>>> release soon.
>>>>>>>>>
>>>>>>>>> Shivaram
>>>>>>>>>
>>>>>>>>> On Tue, Jun 23, 2020 at 3:43 AM Takeshi Yamamuro
>>>>>>>>> <linguin....@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Thanks for the heads-up, Yuanjian!
>>>>>>>>>
>>>>>>>>> I also noticed branch-3.0 already has 39 commits after Spark
>>>>>>>>> 3.0.0.
>>>>>>>>>
>>>>>>>>> wow, the updates are so quick. Anyway, +1 for the release.
>>>>>>>>>
>>>>>>>>> Bests,
>>>>>>>>> Takeshi
>>>>>>>>>
>>>>>>>>> On Tue, Jun 23, 2020 at 4:59 PM Yuanjian Li
>>>>>>>>> <xyliyuanj...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi dev-list,
>>>>>>>>>
>>>>>>>>> I’m writing this to raise the discussion about Spark 3.0.1
>>>>>>>>> feasibility since 4 blocker issues were found after Spark 3.0.0:
>>>>>>>>>
>>>>>>>>> [SPARK-31990] The state store compatibility broken will cause a
>>>>>>>>> correctness issue when Streaming query with `dropDuplicate` uses the
>>>>>>>>> checkpoint written by the old Spark version.
>>>>>>>>>
>>>>>>>>> [SPARK-32038] The regression bug in handling NaN values in
>>>>>>>>> COUNT(DISTINCT)
>>>>>>>>>
>>>>>>>>> [SPARK-31918][WIP] CRAN requires to make it working with the
>>>>>>>>> latest R 4.0. It makes the 3.0 release unavailable on CRAN, and only
>>>>>>>>> supports R [3.5, 4.0)
>>>>>>>>>
>>>>>>>>> [SPARK-31967] Downgrade vis.js to fix Jobs UI loading time
>>>>>>>>> regression
>>>>>>>>>
>>>>>>>>> I also noticed branch-3.0 already has 39 commits after Spark
>>>>>>>>> 3.0.0. I think it would be great if we have Spark 3.0.1 to deliver the
>>>>>>>>> critical fixes.
>>>>>>>>>
>>>>>>>>> Any comments are appreciated.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>>
>>>>>>>>> Yuanjian
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> ---
>>>>>>>>> Takeshi Yamamuro
>>>>>>>>>
>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>>>>>
>>>>>>> --
>>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>>>>> https://amzn.to/2MaRAG9
>>>>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau