Re: [VOTE] Release Apache Spark 1.3.0 (RC3)

Sandy Ryza Sun, 08 Mar 2015 22:52:28 -0700

+1 (non-binding, doc and packaging issues aside)

Built from source, ran jobs and spark-shell against a pseudo-distributed
YARN cluster.


On Sun, Mar 8, 2015 at 2:42 PM, Krishna Sankar <[email protected]> wrote:

> Yep, otherwise this will become an N^2 problem - Scala versions X Hadoop
> Distributions X ...
>
> Maybe one option is to have a minimum basic set (which I know is what we
> are discussing) and move the rest to spark-packages.org. There the vendors
> can add the latest downloads - for example when 1.4 is released, HDP can
> build a release of HDP Spark 1.4 bundle.
>
> Cheers
> <k/>
>
> On Sun, Mar 8, 2015 at 2:11 PM, Patrick Wendell <[email protected]>
> wrote:
>
> > We probably want to revisit the way we do binaries in general for
> > 1.4+. IMO, something worth forking a separate thread for.
> >
> > I've been hesitating to add new binaries because people
> > (understandably) complain if you ever stop packaging older ones, but
> > on the other hand the ASF has complained that we have too many
> > binaries already and that we need to pare it down because of the large
> > volume of files. Doubling the number of binaries we produce for Scala
> > 2.11 seemed like it would be too much.
> >
> > One solution potentially is to actually package "Hadoop provided"
> > binaries and encourage users to use these by simply setting
> > HADOOP_HOME, or have instructions for specific distros. I've heard
> > that our existing packages don't work well on HDP for instance, since
> > there are some configuration quirks that differ from the upstream
> > Hadoop.
> >
> > If we cut down on the cross building for Hadoop versions, then it is
> > more tenable to cross build for Scala versions without exploding the
> > number of binaries.
> >
> > - Patrick
> >
> > On Sun, Mar 8, 2015 at 12:46 PM, Sean Owen <[email protected]> wrote:
> > > Yeah, interesting question of what is the better default for the
> > > single set of artifacts published to Maven. I think there's an
> > > argument for Hadoop 2 and perhaps Hive for the 2.10 build too. Pros
> > > and cons discussed more at
> > >
> > > https://issues.apache.org/jira/browse/SPARK-5134
> > > https://github.com/apache/spark/pull/3917
> > >
> > > On Sun, Mar 8, 2015 at 7:42 PM, Matei Zaharia <[email protected]> wrote:
> > >> +1
> > >>
> > >> Tested it on Mac OS X.
> > >>
> > >> One small issue I noticed is that the Scala 2.11 build is using Hadoop
> > >> 1 without Hive, which is kind of weird because people will more likely
> > >> want Hadoop 2 with Hive. So it would be good to publish a build for that
> > >> configuration instead. We can do it if we do a new RC, or it might be
> > >> that binary builds may not need to be voted on (I forgot the details
> > >> there).
> > >>
> > >> Matei
> >
> >
> >
>
