+1.  Tested using spark-perf and the Spark EC2 scripts.  I didn't notice any 
performance regressions that could not be attributed to changes in the default 
configuration.  To be more specific, when running Spark 1.2.0 with the Spark 
1.1.0 settings of spark.shuffle.manager=hash and 
spark.shuffle.blockTransferService=nio, there was no performance regression 
and, in fact, there were significant performance improvements for some 
workloads.

In Spark 1.2.0, the new default settings are spark.shuffle.manager=sort and 
spark.shuffle.blockTransferService=netty.  With these new settings, I noticed a 
performance regression in the scala-sort-by-key-int spark-perf test.  However, 
Spark 1.1.0 and 1.1.1 exhibit a similar performance regression for that same 
test when run with spark.shuffle.manager=sort, so this regression seems 
attributable to the change of defaults.  Aside from this, most of the other 
tests ran at the same speed or faster with the new 1.2.0 defaults.  Also, keep 
in mind that this is a somewhat artificial microbenchmark; I have heard 
anecdotal reports from many users that their real workloads have run faster 
with 1.2.0.
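
For anyone who wants to reproduce the 1.1.0-style comparison on a 1.2.0 
cluster, the two properties above can be set in spark-defaults.conf (a sketch, 
using the property names quoted in this thread):

```
# Restore the Spark 1.1.x shuffle defaults on a Spark 1.2.0 deployment
spark.shuffle.manager                hash
spark.shuffle.blockTransferService   nio
```

Equivalently, they can be passed per-job via spark-submit with 
--conf spark.shuffle.manager=hash --conf spark.shuffle.blockTransferService=nio.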

Based on these results, I’m comfortable giving a +1 on 1.2.0 RC2.

- Josh

On December 11, 2014 at 9:52:39 AM, Sandy Ryza (sandy.r...@cloudera.com) wrote:

+1 (non-binding). Tested on Ubuntu against YARN.  

On Thu, Dec 11, 2014 at 9:38 AM, Reynold Xin <r...@databricks.com> wrote:  

> +1  
>  
> Tested on OS X.  
>  
> On Wednesday, December 10, 2014, Patrick Wendell <pwend...@gmail.com>  
> wrote:  
>  
> > Please vote on releasing the following candidate as Apache Spark version  
> > 1.2.0!  
> >  
> > The tag to be voted on is v1.2.0-rc2 (commit a428c446e2):  
> >  
> >  
> > https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=a428c446e23e628b746e0626cc02b7b3cadf588e  
> >  
> > The release files, including signatures, digests, etc. can be found at:  
> > http://people.apache.org/~pwendell/spark-1.2.0-rc2/  
> >  
> > Release artifacts are signed with the following key:  
> > https://people.apache.org/keys/committer/pwendell.asc  
> >  
> > The staging repository for this release can be found at:  
> > https://repository.apache.org/content/repositories/orgapachespark-1055/  
> >  
> > The documentation corresponding to this release can be found at:  
> > http://people.apache.org/~pwendell/spark-1.2.0-rc2-docs/  
> >  
> > Please vote on releasing this package as Apache Spark 1.2.0!  
> >  
> > The vote is open until Saturday, December 13, at 21:00 UTC and passes  
> > if a majority of at least 3 +1 PMC votes are cast.  
> >  
> > [ ] +1 Release this package as Apache Spark 1.2.0  
> > [ ] -1 Do not release this package because ...  
> >  
> > To learn more about Apache Spark, please see  
> > http://spark.apache.org/  
> >  
> > == What justifies a -1 vote for this release? ==  
> > This vote is happening relatively late into the QA period, so  
> > -1 votes should only occur for significant regressions from  
> > 1.0.2. Bugs already present in 1.1.X, minor  
> > regressions, or bugs related to new features will not block this  
> > release.  
> >  
> > == What default changes should I be aware of? ==  
> > 1. The default value of "spark.shuffle.blockTransferService" has been  
> > changed to "netty"  
> > --> Old behavior can be restored by switching to "nio"  
> >  
> > 2. The default value of "spark.shuffle.manager" has been changed to  
> > "sort".  
> > --> Old behavior can be restored by setting "spark.shuffle.manager" to  
> > "hash".  
> >  
> > == How does this differ from RC1 ==  
> > This has fixes for a handful of issues identified - some of the  
> > notable fixes are:  
> >  
> > [Core]  
> > SPARK-4498: Standalone Master can fail to recognize completed/failed  
> > applications  
> >  
> > [SQL]  
> > SPARK-4552: Query for empty parquet table in spark sql hive get  
> > IllegalArgumentException  
> > SPARK-4753: Parquet2 does not prune based on OR filters on partition  
> > columns  
> > SPARK-4761: With JDBC server, set Kryo as default serializer and  
> > disable reference tracking  
> > SPARK-4785: When called with arguments referring column fields, PMOD  
> > throws NPE  
> >  
> > - Patrick  
> >  
> > ---------------------------------------------------------------------  
> > To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org  
> > For additional commands, e-mail: dev-h...@spark.apache.org  
> >  
> >  
>  
