Based on the plan, the changes in that PR added an extra Aggregate and Expand for common queries such as:

    SELECT sum(DISTINCT x), avg(DISTINCT x) FROM tab

Both Aggregate and Expand are expensive operators.

2018-06-01 13:24 GMT-07:00 Sean Owen <sro...@gmail.com>:

> Hm, that was merged two days ago, and you decided to revert it 2 hours ago.
>
> It sounds like this was maybe risky to put into 2.3.x during the RC phase,
> at least.
> You also don't seem certain whether there's a performance problem; how
> sure are you?
>
> These may all have been the right thing to do given available info, but
> this does seem like too much rapid change at this stage of an RC.
>
> On Fri, Jun 1, 2018 at 3:20 PM Xiao Li <gatorsm...@gmail.com> wrote:
>
>> Sorry, I need to say -1.
>>
>> This morning, I found a regression in 2.3.1 and reverted
>> https://github.com/apache/spark/pull/21443
>>
>> Xiao
>>
>> 2018-06-01 13:09 GMT-07:00 Marcelo Vanzin <van...@cloudera.com>:
>>
>>> Please vote on releasing the following candidate as Apache Spark
>>> version 2.3.1.
>>>
>>> Given that I expect at least a few people to be busy with Spark Summit
>>> next week, I'm taking the liberty of setting an extended voting period.
>>> The vote will be open until Friday, June 8th, at 19:00 UTC (that's
>>> 12:00 PDT).
>>>
>>> It passes with a majority of +1 votes, which must include at least 3 +1
>>> votes from the PMC.
>>>
>>> [ ] +1 Release this package as Apache Spark 2.3.1
>>> [ ] -1 Do not release this package because ...
>>>
>>> To learn more about Apache Spark, please see http://spark.apache.org/
>>>
>>> The tag to be voted on is v2.3.1-rc3 (commit 1cc5f68b):
>>> https://github.com/apache/spark/tree/v2.3.1-rc3
>>>
>>> The release files, including signatures, digests, etc., can be found at:
>>> https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc3-bin/
>>>
>>> Signatures used for Spark RCs can be found in this file:
>>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>>
>>> The staging repository for this release can be found at:
>>> https://repository.apache.org/content/repositories/orgapachespark-1271/
>>>
>>> The documentation corresponding to this release can be found at:
>>> https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc3-docs/
>>>
>>> The list of bug fixes going into 2.3.1 can be found at the following URL:
>>> https://issues.apache.org/jira/projects/SPARK/versions/12342432
>>>
>>> FAQ
>>>
>>> =========================
>>> How can I help test this release?
>>> =========================
>>>
>>> If you are a Spark user, you can help us test this release by taking
>>> an existing Spark workload, running it on this release candidate, and
>>> reporting any regressions.
>>>
>>> If you're working in PySpark, you can set up a virtual env, install
>>> the current RC, and see if anything important breaks. In Java/Scala,
>>> you can add the staging repository to your project's resolvers and test
>>> with the RC (make sure to clean up the artifact cache before/after so
>>> you don't end up building with an out-of-date RC going forward).
>>>
>>> ===========================================
>>> What should happen to JIRA tickets still targeting 2.3.1?
>>> ===========================================
>>>
>>> The current list of open tickets targeted at 2.3.1 can be found at:
>>> https://s.apache.org/Q3Uo
>>>
>>> Committers should look at those and triage. Extremely important bug
>>> fixes, documentation, and API tweaks that impact compatibility should
>>> be worked on immediately. Everything else, please retarget to an
>>> appropriate release.
>>>
>>> ==================
>>> But my bug isn't fixed?
>>> ==================
>>>
>>> In order to make timely releases, we will typically not hold the
>>> release unless the bug in question is a regression from the previous
>>> release. That being said, if there is something that is a regression
>>> that has not been correctly targeted, please ping me or a committer to
>>> help target the issue.
>>>
>>>
>>> --
>>> Marcelo
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
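For readers who want to see the plan shape Xiao Li describes at the top of the thread, a minimal sketch in Spark SQL follows. The table `tab` is hypothetical (any single numeric column will do); the point is only that the query uses two DISTINCT aggregates over the same argument, which is the case the reverted PR affected.

```sql
-- Hypothetical one-column table, just to make the statement explainable.
CREATE TABLE tab (x INT);

-- Inspect the physical plan for the query shape mentioned in the thread.
-- Per the message above, the reverted change added an extra Expand and an
-- extra Aggregate stage for DISTINCT aggregates sharing the same argument.
EXPLAIN
SELECT sum(DISTINCT x), avg(DISTINCT x) FROM tab;
```

Comparing this EXPLAIN output before and after the revert is one way to confirm whether the extra operators (and the associated cost) appear in your build.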