I don't see how we can be close to the point where we don't need to support
Hadoop 2.x. That doesn't match reality from my perspective: all of our
clients are on Hadoop 2.x, and not a single one is on Hadoop 3.x currently.
This includes deployments of Cloudera distros, Hortonworks distros, and
cloud distros like EMR and Dataproc.

Forcing us to stay on older Spark versions would be unfortunate for us, and
also bad for the community (as deployments like ours help find bugs in
Spark).

On Mon, Oct 28, 2019 at 3:51 PM Sean Owen <sro...@gmail.com> wrote:

> I'm OK with that, but I don't have a strong opinion or info about the
> implications.
> That said, my guess is we're close to the point where we don't need to
> support Hadoop 2.x anyway, so, yeah.
>
> On Mon, Oct 28, 2019 at 2:33 PM Dongjoon Hyun <dongjoon.h...@gmail.com>
> wrote:
> >
> > Hi, All.
> >
> > There was a discussion on publishing artifacts built with Hadoop 3.
> > But we are still publishing with Hadoop 2.7.3, and `3.0-preview` will be
> > the same because we haven't changed anything yet.
> >
> > Technically, we need to change two places for publishing.
> >
> > 1. Jenkins Snapshot Publishing
> >
> > https://amplab.cs.berkeley.edu/jenkins/view/Spark%20Packaging/job/spark-master-maven-snapshots/
> >
> > 2. Release Snapshot/Release Publishing
> >
> > https://github.com/apache/spark/blob/master/dev/create-release/release-build.sh
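> >
> > For example (a rough sketch; the exact invocation in release-build.sh
> > differs), publishing against Hadoop 3 would mean running Maven with the
> > `hadoop-3.2` profile instead of `hadoop-2.7`:
> >
> >   # sketch only: the real flags live in dev/create-release/release-build.sh
> >   ./build/mvn -Phadoop-3.2 -Phive -Phive-thriftserver -DskipTests clean deploy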
> >
> > To minimize the change, we need to switch our default Hadoop profile.
> >
> > Currently, the default is the `hadoop-2.7` (2.7.4) profile, and
> > `hadoop-3.2` (3.2.0) is optional.
> > We should make `hadoop-3.2` the default profile and `hadoop-2.7`
> > optional.
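> >
> > After the switch, Hadoop 2 would become the opt-in case. A minimal
> > sketch, assuming the profile names stay as they are today:
> >
> >   # default build (would now resolve to the hadoop-3.2 profile)
> >   ./build/mvn -DskipTests clean package
> >   # explicit opt-in to Hadoop 2
> >   ./build/mvn -Phadoop-2.7 -DskipTests clean package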
> >
> > Note that this means we use Hive 2.3.6 by default. Only the `hadoop-2.7`
> > distribution will use Hive 1.2.1, as in Apache Spark 2.4.x.
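> >
> > A quick way to check which Hive version a given profile resolves to
> > (a sketch using the standard maven-dependency-plugin):
> >
> >   ./build/mvn -Phadoop-3.2 -Phive dependency:list | grep org.apache.hive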
> >
> > Bests,
> > Dongjoon.
>