(Copying my reply, since I'm not sure whether it went to the mailing list.)

Great, thanks for explaining the reasoning. You're saying these aren't
going into the final release? I think that moots any issue around
distributing them, then.

This is all I know of from the ASF:
https://community.apache.org/projectIndependence.html
I don't read it as expressly forbidding this kind of thing, although
you can see how it bumps up against the spirit of it. There's no
bright line -- what about Tomcat providing binaries compiled for
Windows, for example? Does that favor an OS vendor?

From this technical ASF perspective, only the releases matter -- do
what you want with snapshots and RCs. The only issue there is maybe
releasing something different from what was in the RC; is that at all
confusing? It just needs a note.

I think this theoretical issue doesn't exist if these binaries aren't
released, so I see no reason not to proceed.

The rest is a different question: whether you want to spend time
maintaining this profile and candidate. The vendor already manages
their own build, I think, and -- I don't know -- may even prefer not
to have a different special build floating around. There's also the
theoretical argument that this turns off other vendors from adopting
Spark if it's perceived as too closely tied to particular vendors. I'd
like to maximize Spark's distribution, and there's some argument you
do this by not making vendor profiles. But, as I say, that's a
different question, just something to think about over time...

(Oh, and PS: for my part, I think it's a good thing that the CDH4
binaries were removed. I wasn't arguing for resurrecting them.)

On Fri, Aug 29, 2014 at 7:26 AM, Patrick Wendell <pwend...@gmail.com> wrote:
> Hey Sean,
>
> The reason there are no longer CDH-specific builds is that all newer
> versions of CDH and HDP work with builds for the upstream Hadoop
> projects. I dropped CDH4 in favor of a newer Hadoop version (2.4) and
> the Hadoop-without-Hive (also 2.4) build.
>
> For MapR: we can't officially post those artifacts on ASF web space
> when we make the final release; we can only link to them as being
> hosted by MapR specifically, since they use incompatible licenses.
> However, I felt that providing these during a testing period was
> all right, with the goal of increasing test coverage. I couldn't find
> any policy against posting these on personal web space during RC
> voting; that said, we can remove them if there is one.
>
> Dropping CDH4 was more because it is now pretty old, but we can add it
> back if people want. The binary packaging is a slightly separate
> question from the release votes, so I can always add more binary
> packages later. On this, my main concern is covering the most popular
> Hadoop versions to lower the bar for users to build and test Spark.
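>
> For anyone who wants to roll their own build against a specific Hadoop
> version, a sketch of the idea (the exact profile name and version flag
> are illustrative here and worth checking against the build docs):
>
>   mvn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package
>
> That produces a Spark build linked against the chosen upstream Hadoop
> release, so no vendor-specific profile is strictly required.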
>
> - Patrick
>
> On Thu, Aug 28, 2014 at 11:04 PM, Sean Owen <so...@cloudera.com> wrote:
>> +1. I tested the source and Hadoop 2.4 releases. Checksums and
>> signatures are OK. It compiles fine with Java 8 on OS X. Tests...
>> don't fail any more than usual.
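>>
>> In case it helps anyone else verifying, roughly the steps I mean --
>> file names are illustrative, so adjust to the actual artifacts:
>>
>>   # import the release manager's signing key (linked in the vote mail)
>>   curl https://people.apache.org/keys/committer/pwendell.asc | gpg --import
>>   # check the detached signature on a downloaded artifact
>>   gpg --verify spark-1.1.0.tgz.asc spark-1.1.0.tgz
>>   # recompute the checksum and compare it with the posted digest file
>>   shasum spark-1.1.0.tgz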
>>
>> FWIW I've also been using the 1.1.0-SNAPSHOT for some time in another
>> project and have encountered no problems.
>>
>>
>> I notice that the 1.1.0 release removes the CDH4-specific build, but
>> adds two MapR-specific builds; compare with
>> https://dist.apache.org/repos/dist/release/spark/spark-1.0.2/
>> I commented on the commit:
>> https://github.com/apache/spark/commit/ceb19830b88486faa87ff41e18d03ede713a73cc
>>
>> I'm in favor of removing all vendor-specific builds. This change
>> *looks* a bit funny, as there was no JIRA (?) and it appears to swap
>> one vendor for another. Of course there's nothing untoward going on,
>> but what was the reasoning? This sort of change is best avoided, and
>> MapR already distributes Spark just fine, no?
>>
>> This is a gray area for ASF projects. I mention it as well because it
>> came up with Apache Flink recently
>> (http://mail-archives.eu.apache.org/mod_mbox/incubator-flink-dev/201408.mbox/%3CCANC1h_u%3DN0YKFu3pDaEVYz5ZcQtjQnXEjQA2ReKmoS%2Bye7%3Do%3DA%40mail.gmail.com%3E),
>> where another vendor rightly noted that it could look like favoritism.
>> They ended up removing the vendor releases.
>>
>> On Fri, Aug 29, 2014 at 3:14 AM, Patrick Wendell <pwend...@gmail.com> wrote:
>>> Please vote on releasing the following candidate as Apache Spark version 
>>> 1.1.0!
>>>
>>> The tag to be voted on is v1.1.0-rc2 (commit 711aebb3):
>>> https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=711aebb329ca28046396af1e34395a0df92b5327
>>>
>>> The release files, including signatures, digests, etc. can be found at:
>>> http://people.apache.org/~pwendell/spark-1.1.0-rc2/
>>>
>>> Release artifacts are signed with the following key:
>>> https://people.apache.org/keys/committer/pwendell.asc
>>>
>>> The staging repository for this release can be found at:
>>> https://repository.apache.org/content/repositories/orgapachespark-1029/
>>>
>>> The documentation corresponding to this release can be found at:
>>> http://people.apache.org/~pwendell/spark-1.1.0-rc2-docs/
>>>
>>> Please vote on releasing this package as Apache Spark 1.1.0!
>>>
>>> The vote is open until Monday, September 01, at 03:11 UTC and passes if
>>> a majority of at least 3 +1 PMC votes are cast.
>>>
>>> [ ] +1 Release this package as Apache Spark 1.1.0
>>> [ ] -1 Do not release this package because ...
>>>
>>> To learn more about Apache Spark, please see
>>> http://spark.apache.org/
>>>
>>> == Regressions fixed since RC1 ==
>>> LZ4 compression issue: https://issues.apache.org/jira/browse/SPARK-3277
>>>
>>> == What justifies a -1 vote for this release? ==
>>> This vote is happening very late into the QA period compared with
>>> previous votes, so -1 votes should only occur for significant
>>> regressions from 1.0.2. Bugs already present in 1.0.X will not block
>>> this release.
>>>
>>> == What default changes should I be aware of? ==
>>> 1. The default value of "spark.io.compression.codec" is now "snappy"
>>> --> Old behavior can be restored by switching to "lzf"
>>>
>>> 2. PySpark now performs external spilling during aggregations.
>>> --> Old behavior can be restored by setting "spark.shuffle.spill" to 
>>> "false".
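>>>
>>> For example -- a sketch only, assuming the standard spark-defaults.conf
>>> mechanism -- both old behaviors can be restored with:
>>>
>>>   # conf/spark-defaults.conf
>>>   spark.io.compression.codec   lzf
>>>   spark.shuffle.spill          false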
>>>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org
