Just to share the current status: most of the known issues have been resolved.
Let me know if there are any more.
The one thing left is a performance regression in TPC-DS, which is still being investigated.
Once the cause is identified (and fixed, if it needs to be), I will cut another RC
right away.
I roughly expect to cut another RC next Monday.

Thanks guys.

On Wed, Jan 27, 2021 at 5:26 AM Terry Kim <yumin...@gmail.com> wrote:

> Hi,
>
> Please check if the following regression should be included:
> https://github.com/apache/spark/pull/31352
>
> Thanks,
> Terry
>
> On Tue, Jan 26, 2021 at 7:54 AM Holden Karau <hol...@pigscanfly.ca> wrote:
>
>> If we’re OK waiting for it, I’d like to get
>> https://github.com/apache/spark/pull/31298 in as well (it’s not a
>> regression but it is a bug fix).
>>
>> On Tue, Jan 26, 2021 at 6:38 AM Hyukjin Kwon <gurwls...@gmail.com> wrote:
>>
>>> It looks like a cool one but it's a pretty big one and affects the plans
>>> considerably ... maybe it's best to avoid adding it into 3.1.1 in
>>> particular during the RC period if this isn't a clear regression that
>>> affects many users.
>>>
>>> On Tue, Jan 26, 2021 at 11:23 PM Peter Toth <peter.t...@gmail.com> wrote:
>>>
>>>> Hey,
>>>>
>>>> Sorry for chiming in a bit late, but I would like to suggest my PR (
>>>> https://github.com/apache/spark/pull/28885) for review and inclusion
>>>> into 3.1.1.
>>>>
>>>> Currently, invalid reuse reference nodes appear in many queries,
>>>> causing performance issues and incorrect explain plans. Now that
>>>> https://github.com/apache/spark/pull/31243 got merged these invalid
>>>> references can be easily found in many of our golden files on master:
>>>> https://github.com/apache/spark/pull/28885#issuecomment-767530441.
>>>> But the issue isn't master (3.2) specific, actually it has been there
>>>> since 3.0 when Dynamic Partition Pruning was added.
>>>> So it is not a regression from 3.0 to 3.1.1, but in some cases (like
>>>> TPCDS q23b) it is causing performance regression from 2.4 to 3.x.
>>>>
>>>> Thanks,
>>>> Peter
>>>>
>>>> On Tue, Jan 26, 2021 at 6:30 AM Hyukjin Kwon <gurwls...@gmail.com>
>>>> wrote:
>>>>
>>>>> Guys, I plan to make an RC as soon as we have no visible issues. I
>>>>> have merged fixes for a few correctness issues. The remaining items look like:
>>>>> - https://github.com/apache/spark/pull/31319 is waiting for a review (I
>>>>> will review it soon too).
>>>>> - https://github.com/apache/spark/pull/31336
>>>>> - I know Max is investigating the perf regression, which hopefully
>>>>> will be fixed soon.
>>>>>
>>>>> Are there any more blockers or correctness issues? Please ping me or
>>>>> raise them here.
>>>>> I would like to avoid making an RC when there are clearly some issues
>>>>> to be fixed.
>>>>> If you're investigating something suspicious, that's fine too. It's
>>>>> better to make sure we're safe instead of rushing an RC without finishing
>>>>> the investigation.
>>>>>
>>>>> Thanks all.
>>>>>
>>>>>
>>>>> On Fri, Jan 22, 2021 at 6:19 PM Hyukjin Kwon <gurwls...@gmail.com> wrote:
>>>>>
>>>>>> Sure, thanks guys. I'll start another RC after the fixes. Looks like
>>>>>> we're almost there.
>>>>>>
>>>>>> On Fri, 22 Jan 2021, 17:47 Wenchen Fan, <cloud0...@gmail.com> wrote:
>>>>>>
>>>>>>> BTW, there is a correctness bug being fixed at
>>>>>>> https://github.com/apache/spark/pull/30788 . It's not a regression,
>>>>>>> but the fix is very simple and it would be better to start the next RC
>>>>>>> after merging that fix.
>>>>>>>
>>>>>>> On Fri, Jan 22, 2021 at 3:54 PM Maxim Gekk <
>>>>>>> maxim.g...@databricks.com> wrote:
>>>>>>>
>>>>>>>> Also I am investigating a performance regression in some TPC-DS
>>>>>>>> queries (q88 for instance) that is caused by a recent commit in 3.1,
>>>>>>>> highly likely in the period from 19th November, 2020 to 18th December,
>>>>>>>> 2020.
>>>>>>>>
>>>>>>>> Maxim Gekk
>>>>>>>>
>>>>>>>> Software Engineer
>>>>>>>>
>>>>>>>> Databricks, Inc.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Jan 22, 2021 at 10:45 AM Wenchen Fan <cloud0...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> -1 as I just found a regression in 3.1. A self-join query works
>>>>>>>>> well in 3.0 but fails in 3.1. It's being fixed at
>>>>>>>>> https://github.com/apache/spark/pull/31287
>>>>>>>>>
>>>>>>>>> On Fri, Jan 22, 2021 at 4:34 AM Tom Graves
>>>>>>>>> <tgraves...@yahoo.com.invalid> wrote:
>>>>>>>>>
>>>>>>>>>> +1
>>>>>>>>>>
>>>>>>>>>> built from tarball, verified sha and regular CI and tests all
>>>>>>>>>> pass.
>>>>>>>>>>
>>>>>>>>>> Tom
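For anyone repeating the "verified sha" step above, here is a minimal Python sketch of one way to check a downloaded tarball against a published SHA-512 digest. The helper names (`sha512_of`, `digest_matches`) are made up for illustration, and the sketch assumes you have already copied the expected hex digest out of the published checksum file, whose exact layout is not described in this thread.

```python
import hashlib


def sha512_of(path, chunk_size=1 << 20):
    """Return the lowercase hex SHA-512 digest of the file at `path`."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        # Read in chunks so large release tarballs are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_matches(path, expected_hex):
    """Compare the file's digest with an expected hex string.

    Whitespace and letter case in `expected_hex` are ignored, on the
    assumption that published digests may be grouped or upper-cased.
    """
    cleaned = "".join(expected_hex.split()).lower()
    return sha512_of(path) == cleaned
```

This only checks integrity; verifying the signature against the KEYS file is a separate step.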
>>>>>>>>>>
>>>>>>>>>> On Monday, January 18, 2021, 06:06:42 AM CST, Hyukjin Kwon <
>>>>>>>>>> gurwls...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Please vote on releasing the following candidate as Apache Spark
>>>>>>>>>> version 3.1.1.
>>>>>>>>>>
>>>>>>>>>> The vote is open until January 22nd 4PM PST and passes if a
>>>>>>>>>> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
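As a rough illustration of the passing rule above, the following Python sketch tallies binding votes. It rests on assumptions not spelled out in this thread: only PMC votes are counted, "majority" is read as more +1 than -1 votes, and 0 votes are ignored. The function name `vote_passes` is hypothetical.

```python
def vote_passes(binding_votes, minimum_plus_ones=3):
    """Return True if the binding (PMC) votes satisfy the rule as read here.

    `binding_votes` is an iterable of +1, 0, or -1 values.
    Assumption: "majority" means strictly more +1 than -1 votes.
    """
    votes = list(binding_votes)
    plus = sum(1 for v in votes if v == 1)
    minus = sum(1 for v in votes if v == -1)
    return plus >= minimum_plus_ones and plus > minus
```

In practice a -1 that reports a real regression usually leads to a new RC regardless of the count, as happens later in this thread.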
>>>>>>>>>>
>>>>>>>>>> [ ] +1 Release this package as Apache Spark 3.1.1
>>>>>>>>>> [ ] -1 Do not release this package because ...
>>>>>>>>>>
>>>>>>>>>> To learn more about Apache Spark, please see
>>>>>>>>>> http://spark.apache.org/
>>>>>>>>>>
>>>>>>>>>> The tag to be voted on is v3.1.1-rc1 (commit
>>>>>>>>>> 53fe365edb948d0e05a5ccb62f349cd9fcb4bb5d):
>>>>>>>>>> https://github.com/apache/spark/tree/v3.1.1-rc1
>>>>>>>>>>
>>>>>>>>>> The release files, including signatures, digests, etc. can be
>>>>>>>>>> found at:
>>>>>>>>>> https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc1-bin/
>>>>>>>>>>
>>>>>>>>>> Signatures used for Spark RCs can be found in this file:
>>>>>>>>>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>>>>>>>>>
>>>>>>>>>> The staging repository for this release can be found at:
>>>>>>>>>>
>>>>>>>>>> https://repository.apache.org/content/repositories/orgapachespark-1364
>>>>>>>>>>
>>>>>>>>>> The documentation corresponding to this release can be found at:
>>>>>>>>>> https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc1-docs/
>>>>>>>>>>
>>>>>>>>>> The list of bug fixes going into 3.1.1 can be found at the
>>>>>>>>>> following URL:
>>>>>>>>>> https://s.apache.org/41kf2
>>>>>>>>>>
>>>>>>>>>> This release is using the release script of the tag v3.1.1-rc1.
>>>>>>>>>>
>>>>>>>>>> FAQ
>>>>>>>>>>
>>>>>>>>>> ===================
>>>>>>>>>> What happened to 3.1.0?
>>>>>>>>>> ===================
>>>>>>>>>>
>>>>>>>>>> There was a technical issue during Apache Spark 3.1.0
>>>>>>>>>> preparation, and it was discussed and decided to skip 3.1.0.
>>>>>>>>>> Please see
>>>>>>>>>> https://spark.apache.org/news/next-official-release-spark-3.1.1.html
>>>>>>>>>> for more details.
>>>>>>>>>>
>>>>>>>>>> =========================
>>>>>>>>>> How can I help test this release?
>>>>>>>>>> =========================
>>>>>>>>>>
>>>>>>>>>> If you are a Spark user, you can help us test this release by taking
>>>>>>>>>> an existing Spark workload and running it on this release candidate,
>>>>>>>>>> then reporting any regressions.
>>>>>>>>>>
>>>>>>>>>> If you're working in PySpark, you can set up a virtual env and install
>>>>>>>>>> the current RC via
>>>>>>>>>> "pip install https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc1-bin/pyspark-3.1.1.tar.gz"
>>>>>>>>>> and see if anything important breaks.
>>>>>>>>>> In Java/Scala, you can add the staging repository to your project's
>>>>>>>>>> resolvers and test with the RC (make sure to clean up the artifact cache
>>>>>>>>>> before/after so you don't end up building with an out-of-date RC going
>>>>>>>>>> forward).
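The PySpark install step above can be scripted. The Python sketch below builds the RC tarball URL and the matching pip command. The URL pattern is inferred from the single rc1 link quoted above and is only an assumption for other versions or RCs; the helper names are hypothetical.

```python
import sys

# Base of the dev distribution area, taken from the links in the vote email.
DIST_DEV = "https://dist.apache.org/repos/dist/dev/spark"


def rc_pyspark_url(version, rc):
    """Build the PySpark tarball URL for a release candidate.

    Assumption: other RCs follow the same layout as the rc1 link above.
    """
    return f"{DIST_DEV}/v{version}-rc{rc}-bin/pyspark-{version}.tar.gz"


def pip_install_command(version, rc, python=sys.executable):
    """Return the pip command (as an argument list) that installs the RC."""
    return [python, "-m", "pip", "install", rc_pyspark_url(version, rc)]
```

Inside a fresh virtual env, something like `subprocess.run(pip_install_command("3.1.1", 1), check=True)` would run the install; keeping it in a throwaway env makes it easy to discard the RC afterwards.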
>>>>>>>>>>
>>>>>>>>>> ===========================================
>>>>>>>>>> What should happen to JIRA tickets still targeting 3.1.1?
>>>>>>>>>> ===========================================
>>>>>>>>>>
>>>>>>>>>> The current list of open tickets targeted at 3.1.1 can be found at:
>>>>>>>>>> https://issues.apache.org/jira/projects/SPARK and search for
>>>>>>>>>> "Target Version/s" = 3.1.1
>>>>>>>>>>
>>>>>>>>>> Committers should look at those and triage. Extremely important bug
>>>>>>>>>> fixes, documentation, and API tweaks that impact compatibility should
>>>>>>>>>> be worked on immediately. Everything else please retarget to an
>>>>>>>>>> appropriate release.
>>>>>>>>>>
>>>>>>>>>> ==================
>>>>>>>>>> But my bug isn't fixed?
>>>>>>>>>> ==================
>>>>>>>>>>
>>>>>>>>>> In order to make timely releases, we will typically not hold the
>>>>>>>>>> release unless the bug in question is a regression from the previous
>>>>>>>>>> release. That being said, if there is something which is a regression
>>>>>>>>>> that has not been correctly targeted please ping me or a committer to
>>>>>>>>>> help target the issue.
>>>>>>>>>>
>>>>>>>>>> --
>> Twitter: https://twitter.com/holdenkarau
>> Books (Learning Spark, High Performance Spark, etc.):
>> https://amzn.to/2MaRAG9
>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>
>
