+1 (non-binding)

Installed version pre-built for Hadoop on a private HPC
ran PySpark shell w/ IPython
loaded data using custom Hadoop input formats (see the sketch below)
ran MLlib routines in PySpark
ran custom workflows in PySpark
browsed the web UI
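For anyone who wants to reproduce the custom-input-format step, a minimal PySpark sketch (the InputFormat class, key/value classes, and path below are hypothetical placeholders, not the formats actually tested):

    from pyspark import SparkContext

    sc = SparkContext(appName="input-format-check")  # hypothetical app name

    # Read records through a custom Hadoop InputFormat. The custom class
    # must be on the classpath, e.g. via `pyspark --jars my-formats.jar`.
    rdd = sc.newAPIHadoopFile(
        "hdfs:///data/records",               # hypothetical path
        "com.example.hadoop.MyInputFormat",   # hypothetical InputFormat class
        "org.apache.hadoop.io.LongWritable",  # key class
        "org.apache.hadoop.io.Text")          # value class

    print(rdd.take(5))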
Noticeable improvements in stability and performance during large shuffles, as well as the elimination of the frequent but unpredictable “FileNotFound / too many open files” errors. We initially hit errors during large collects that ran fine in 1.1, but setting the new spark.driver.maxResultSize to 0 preserved the old behavior. This setting is definitely worth highlighting in the release notes, as the new default may be too small for some users and workloads.

— Jeremy

-------------------------
jeremyfreeman.net
@thefreemanlab
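For reference, the workaround Jeremy describes is a one-line configuration change; a minimal sketch (the app name is a hypothetical placeholder):

    from pyspark import SparkConf, SparkContext

    # Setting spark.driver.maxResultSize to 0 removes the new cap on the
    # total size of serialized results returned to the driver, restoring
    # the 1.1 behavior for large collects.
    conf = (SparkConf()
            .setAppName("large-collect-check")  # hypothetical app name
            .set("spark.driver.maxResultSize", "0"))
    sc = SparkContext(conf=conf)

The same setting can also be passed on the command line, e.g. --conf spark.driver.maxResultSize=0 to spark-submit.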
On Dec 2, 2014, at 3:22 AM, Denny Lee <denny.g....@gmail.com> wrote:

> +1 (non-binding)
>
> Verified on OSX 10.10.2, built from source:
> spark-shell / spark-submit jobs
> ran various simple Spark / Scala queries
> ran various SparkSQL queries (including HiveContext)
> ran ThriftServer service and connected via beeline
> ran SparkSVD
>
> On Mon Dec 01 2014 at 11:09:26 PM Patrick Wendell <pwend...@gmail.com> wrote:
>
>> Hey All,
>>
>> Just an update. Josh, Andrew, and others are working to reproduce
>> SPARK-4498 and fix it. Other than that issue, no serious regressions
>> have been reported so far. If we are able to get a fix in for that
>> soon, we'll likely cut another RC with the patch.
>>
>> Continued testing of RC1 is definitely appreciated!
>>
>> I'll leave this vote open to allow folks to continue posting comments.
>> It's fine to still give "+1" from your own testing... i.e. you can
>> assume at this point SPARK-4498 will be fixed before releasing.
>>
>> - Patrick
>>
>> On Mon, Dec 1, 2014 at 3:30 PM, Matei Zaharia <matei.zaha...@gmail.com> wrote:
>>
>>> +0.9 from me. Tested it on Mac and Windows (someone has to do it) and
>>> while things work, I noticed a few recent scripts don't have Windows
>>> equivalents, namely https://issues.apache.org/jira/browse/SPARK-4683
>>> and https://issues.apache.org/jira/browse/SPARK-4684. The first one at
>>> least would be good to fix if we do another RC. Not blocking the
>>> release, but useful to fix in the docs is
>>> https://issues.apache.org/jira/browse/SPARK-4685.
>>>
>>> Matei
>>>
>>>> On Dec 1, 2014, at 11:18 AM, Josh Rosen <rosenvi...@gmail.com> wrote:
>>>>
>>>> Hi everyone,
>>>>
>>>> There's an open bug report related to Spark standalone which could be
>>>> a potential release-blocker (pending investigation / a bug fix):
>>>> https://issues.apache.org/jira/browse/SPARK-4498. This issue seems
>>>> non-deterministic and only affects long-running Spark standalone
>>>> deployments, so it may be hard to reproduce. I'm going to work on a
>>>> patch to add additional logging in order to help with debugging.
>>>>
>>>> I just wanted to give an early heads-up about this issue and to get
>>>> more eyes on it in case anyone else has run into it or wants to help
>>>> with debugging.
>>>>
>>>> - Josh
>>>>
>>>> On November 28, 2014 at 9:18:09 PM, Patrick Wendell (pwend...@gmail.com) wrote:
>>>>
>>>> Please vote on releasing the following candidate as Apache Spark
>>>> version 1.2.0!
>>>>
>>>> The tag to be voted on is v1.2.0-rc1 (commit 1056e9ec1):
>>>> https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=1056e9ec13203d0c51564265e94d77a054498fdb
>>>>
>>>> The release files, including signatures, digests, etc. can be found at:
>>>> http://people.apache.org/~pwendell/spark-1.2.0-rc1/
>>>>
>>>> Release artifacts are signed with the following key:
>>>> https://people.apache.org/keys/committer/pwendell.asc
>>>>
>>>> The staging repository for this release can be found at:
>>>> https://repository.apache.org/content/repositories/orgapachespark-1048/
>>>>
>>>> The documentation corresponding to this release can be found at:
>>>> http://people.apache.org/~pwendell/spark-1.2.0-rc1-docs/
>>>>
>>>> Please vote on releasing this package as Apache Spark 1.2.0!
>>>>
>>>> The vote is open until Tuesday, December 02, at 05:15 UTC and passes
>>>> if a majority of at least 3 +1 PMC votes are cast.
>>>>
>>>> [ ] +1 Release this package as Apache Spark 1.2.0
>>>> [ ] -1 Do not release this package because ...
>>>>
>>>> To learn more about Apache Spark, please see
>>>> http://spark.apache.org/
>>>>
>>>> == What justifies a -1 vote for this release? ==
>>>> This vote is happening very late in the QA period compared with
>>>> previous votes, so -1 votes should only occur for significant
>>>> regressions from 1.1.X. Bugs already present in 1.1.X, minor
>>>> regressions, or bugs related to new features will not block this
>>>> release.
>>>>
>>>> == What default changes should I be aware of? ==
>>>> 1. The default value of "spark.shuffle.blockTransferService" has been
>>>> changed to "netty"
>>>> --> Old behavior can be restored by switching to "nio"
>>>>
>>>> 2. The default value of "spark.shuffle.manager" has been changed to "sort".
>>>> --> Old behavior can be restored by setting "spark.shuffle.manager" to "hash".
>>>>
>>>> == Other notes ==
>>>> Because this vote is occurring over a weekend, I will likely extend
>>>> the vote if this RC survives until the end of the vote period.
>>>>
>>>> - Patrick
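For anyone testing the two default changes listed in the notes above, a minimal sketch of pinning the old 1.1 behavior from PySpark (the app name is a hypothetical placeholder; the keys and values are the ones named in the release notes):

    from pyspark import SparkConf, SparkContext

    # Revert the two shuffle defaults that changed in 1.2.0.
    conf = (SparkConf()
            .setAppName("old-shuffle-defaults")                # hypothetical
            .set("spark.shuffle.blockTransferService", "nio")  # 1.2 default: "netty"
            .set("spark.shuffle.manager", "hash"))             # 1.2 default: "sort"
    sc = SparkContext(conf=conf)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org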