Ok, thanks for the update. That one is marked as an improvement; if it's a blocker, can we mark it as such and describe why? I searched JIRAs and didn't see any critical issues or blockers open.
Tom

On Tuesday, February 2, 2021, 05:12:24 PM CST, Hyukjin Kwon <gurwls...@gmail.com> wrote:
There is one here: https://github.com/apache/spark/pull/31440. There look to be several issues being identified (to confirm that they are issues in OSS too) and fixed in parallel.
There have been some unexpected delays here, as several more issues were found. I will try to file and share the relevant JIRAs as soon as I can confirm.

On Wed, Feb 3, 2021 at 2:36 AM, Tom Graves <tgraves...@yahoo.com> wrote:

Just curious if we have an update on the next RC. Is there a JIRA for the TPCDS issue?
Thanks,
Tom
    On Wednesday, January 27, 2021, 05:46:27 PM CST, Hyukjin Kwon 
<gurwls...@gmail.com> wrote:  
 
Just to share the current status: most of the known issues have been resolved. Let me know if there are any more.
One thing left is a performance regression in TPCDS that is being investigated. Once the cause is identified (and fixed, if it should be), I will cut another RC right away.
I roughly expect to cut another RC next Monday.

Thanks guys.
On Wed, Jan 27, 2021 at 5:26 AM, Terry Kim <yumin...@gmail.com> wrote:

Hi,
Please check if the following regression should be included:
https://github.com/apache/spark/pull/31352
Thanks,
Terry
On Tue, Jan 26, 2021 at 7:54 AM Holden Karau <hol...@pigscanfly.ca> wrote:

If we're ok waiting for it, I'd like to get https://github.com/apache/spark/pull/31298 in as well (it's not a regression, but it is a bug fix).
On Tue, Jan 26, 2021 at 6:38 AM Hyukjin Kwon <gurwls...@gmail.com> wrote:

It looks like a cool one, but it's a pretty big one and affects the plans considerably ... maybe it's best to avoid adding it into 3.1.1, in particular during the RC period, if it isn't a clear regression that affects many users.
On Tue, Jan 26, 2021 at 11:23 PM, Peter Toth <peter.t...@gmail.com> wrote:

Hey,
Sorry for chiming in a bit late, but I would like to suggest my PR (https://github.com/apache/spark/pull/28885) for review and inclusion in 3.1.1.

Currently, invalid reuse reference nodes appear in many queries, causing performance issues and incorrect explain plans. Now that https://github.com/apache/spark/pull/31243 has been merged, these invalid references can easily be found in many of our golden files on master: https://github.com/apache/spark/pull/28885#issuecomment-767530441.
But the issue isn't master (3.2) specific; it has actually been there since 3.0, when Dynamic Partition Pruning was added.
So it is not a regression from 3.0 to 3.1.1, but in some cases (like TPCDS q23b) it causes a performance regression from 2.4 to 3.x.

Thanks,
Peter
On Tue, Jan 26, 2021 at 6:30 AM Hyukjin Kwon <gurwls...@gmail.com> wrote:

Guys, I plan to make an RC as soon as we have no visible issues. I have merged a few correctness fixes. There are still:
- https://github.com/apache/spark/pull/31319, waiting for a review (I will do it soon too).
- https://github.com/apache/spark/pull/31336
- The perf regression that I know Max is investigating, which hopefully will be fixed soon.

Are there any more blockers or correctness issues? Please ping me or call them out here.
I would like to avoid making an RC when there are clearly some issues to be fixed.
If you're investigating something suspicious, that's fine too. It's better to make sure we're safe than to rush an RC without finishing the investigation.

Thanks all.


On Fri, Jan 22, 2021 at 6:19 PM, Hyukjin Kwon <gurwls...@gmail.com> wrote:

Sure, thanks guys. I'll start another RC after the fixes. Looks like we're 
almost there.
On Fri, 22 Jan 2021, 17:47 Wenchen Fan, <cloud0...@gmail.com> wrote:

BTW, there is a correctness bug being fixed at https://github.com/apache/spark/pull/30788. It's not a regression, but the fix is very simple, and it would be better to start the next RC after merging that fix.
On Fri, Jan 22, 2021 at 3:54 PM Maxim Gekk <maxim.g...@databricks.com> wrote:

Also, I am investigating a performance regression in some TPC-DS queries (q88, for instance) that is caused by a recent commit in 3.1, highly likely in the period from November 19, 2020 to December 18, 2020.
Maxim Gekk

Software Engineer

Databricks, Inc.


On Fri, Jan 22, 2021 at 10:45 AM Wenchen Fan <cloud0...@gmail.com> wrote:

-1, as I just found a regression in 3.1. A self-join query works well in 3.0 but fails in 3.1. It's being fixed at https://github.com/apache/spark/pull/31287.
On Fri, Jan 22, 2021 at 4:34 AM Tom Graves <tgraves...@yahoo.com.invalid> wrote:

+1
Built from the tarball, verified the sha, and regular CI and tests all pass.
Tom
    On Monday, January 18, 2021, 06:06:42 AM CST, Hyukjin Kwon 
<gurwls...@gmail.com> wrote:  
 
Please vote on releasing the following candidate as Apache Spark version 3.1.1.
The vote is open until January 22nd 4PM PST and passes if a majority of +1 PMC votes are cast, with a minimum of 3 +1 votes.
[ ] +1 Release this package as Apache Spark 3.1.1
[ ] -1 Do not release this package because ...
To learn more about Apache Spark, please see http://spark.apache.org/
The tag to be voted on is v3.1.1-rc1 (commit 53fe365edb948d0e05a5ccb62f349cd9fcb4bb5d):
https://github.com/apache/spark/tree/v3.1.1-rc1
The release files, including signatures, digests, etc., can be found at:
https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc1-bin/
Signatures used for Spark RCs can be found in this file:
https://dist.apache.org/repos/dist/dev/spark/KEYS
The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1364
The documentation corresponding to this release can be found at:
https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc1-docs/

The list of bug fixes going into 3.1.1 can be found at the following URL:
https://s.apache.org/41kf2
This release is using the release script of the tag v3.1.1-rc1.
FAQ

===================
What happened to 3.1.0?
===================

There was a technical issue during Apache Spark 3.1.0 preparation, and it was 
discussed and decided to skip 3.1.0.
Please see https://spark.apache.org/news/next-official-release-spark-3.1.1.html 
for more details.

=========================
How can I help test this release?
=========================
If you are a Spark user, you can help us test this release by taking an existing Spark workload and running it on this release candidate, then reporting any regressions.
If you're working in PySpark, you can set up a virtual env and install the current RC via "pip install https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc1-bin/pyspark-3.1.1.tar.gz" and see if anything important breaks. In Java/Scala, you can add the staging repository to your project's resolvers and test with the RC (make sure to clean up the artifact cache before/after so you don't end up building with an out-of-date RC going forward).
===========================================
What should happen to JIRA tickets still targeting 3.1.1?
===========================================
The current list of open tickets targeted at 3.1.1 can be found at:
https://issues.apache.org/jira/projects/SPARK and search for "Target Version/s" = 3.1.1
Committers should look at those and triage. Extremely important bug fixes, documentation, and API tweaks that impact compatibility should be worked on immediately. Everything else please retarget to an appropriate release.
==================
But my bug isn't fixed?
==================
In order to make timely releases, we will typically not hold the release unless the bug in question is a regression from the previous release. That being said, if there is something which is a regression that has not been correctly targeted, please ping me or a committer to help target the issue.

-- 
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9 
YouTube Live Streams: https://www.youtube.com/user/holdenkarau