On Wed, Nov 11, 2015 at 12:10 AM, Reynold Xin <r...@databricks.com> wrote:
> to the Spark community. A major release should not be very different from a
> minor release and should not be gated based on new features. The main
> purpose of a major release is an opportunity to fix things that are broken
> in the current API and remove certain deprecated APIs (examples follow).

Agree with this stance. Generally, a major release might also be a
time to replace some big old API or implementation with a new one, but
I don't see obvious candidates.

I wouldn't mind turning attention to 2.x sooner rather than later,
unless there's a fairly good reason to keep adding features to 1.x in
a 1.7 release. The scope as of 1.6 is already pretty darned big.


> 1. Scala 2.11 as the default build. We should still support Scala 2.10, but
> it has been end-of-life.

By the time 2.x rolls around, Scala 2.12 will be the main version,
2.11 will be quite stable, and 2.10 will have been EOL for a while.
I'd propose dropping 2.10; otherwise we're committing to supporting
it for two more years.
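
For what it's worth, keeping 2.10 around isn't hard mechanically; the
cost is in testing and the support window. A minimal cross-build
sketch in sbt terms (version numbers are placeholders, not a proposal
for the actual build):

    // Hypothetical build.sbt fragment: 2.11 as the default, 2.10 still cross-built.
    scalaVersion := "2.11.7"
    crossScalaVersions := Seq("2.10.6", "2.11.7")

    // Any version-specific sources can live under src/main/scala-2.10 / scala-2.11.
    libraryDependencies += "org.scala-lang" % "scala-reflect" % scalaVersion.value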


> 2. Remove Hadoop 1 support.

I'd go further and drop support for Hadoop <2.2 for sure (2.0 and 2.1
were sort of 'alpha' and 'beta' releases), and even <2.6.
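
Whatever floor we pick, it would be nice to enforce it in one place
in the build rather than scattering version checks around. A rough
sketch of what that could look like in sbt terms (the property name
and defaults here are assumptions, purely for illustration):

    // Hypothetical sketch: take the Hadoop version from -Dhadoop.version
    // (defaulting to 2.6.0) and refuse to build below the supported floor.
    val hadoopVersion = sys.props.getOrElse("hadoop.version", "2.6.0")
    val Array(major, minor) = hadoopVersion.split("\\.").take(2).map(_.toInt)
    require(major > 2 || (major == 2 && minor >= 2),
      s"Hadoop $hadoopVersion is below the minimum supported version")

    libraryDependencies += "org.apache.hadoop" % "hadoop-client" % hadoopVersion % "provided"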

I'm sure we'll think of a number of other small things -- shading a
bunch of dependencies? reviewing and updating dependencies, now that
the set we have to support from Hadoop etc. is simpler and more
recent?
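
On shading specifically, sbt-assembly's rename rules (or the
maven-shade-plugin's relocations, for the Maven build) make the
mechanics fairly painless. A sketch, using Guava as an example and a
made-up target package:

    // Hypothetical sbt-assembly fragment: relocate Guava so our copy can't
    // clash with whatever version Hadoop or user code pulls in.
    assemblyShadeRules in assembly := Seq(
      ShadeRule.rename("com.google.common.**" -> "org.spark_project.shaded.guava.@1").inAll
    )

The real work is deciding which dependencies are worth relocating and
making sure nothing shaded leaks through the public API.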

Farm out Tachyon to a module? (I felt like someone proposed this?)
Pop any Docker stuff out into another repo?
Continue that same effort for EC2?
Farm out some of the "external" integrations to another repo?
(possibly controversial)

See also anything marked version "2+" in JIRA.
