I actually talked quite a bit tonight with an engineer on the Scala compiler
team, and the Scala 2.10 + Java 8 combo should be OK. The latest
Scala 2.10 release should have all the important fixes that are needed for
Java 8.

On Thu, Mar 24, 2016 at 1:01 AM, Sean Owen <so...@cloudera.com> wrote:

> I generally favor this for the simplification. I didn't realize there
> were actually some performance wins and important bug fixes.
>
> I've had lots of trouble with scalac 2.10 + Java 8. I don't know if
> it's still a problem since 2.11 + 8 seems OK, but for a long time the
> sql/ modules would never compile in this configuration. If Java 8 is
> actually required for Scala 2.12, that makes sense.
>
> As ever, my general stance is that nobody has to make a major-version
> upgrade; Spark 1.6 does not stop working for those who need Java 7. I
> also think it's reasonable for anyone to expect that major-version
> upgrades require major-version dependency updates. Also remember that
> not removing Java 7 support means committing to it here for a couple
> more years. It's not just about the situation on release day.
>
> On Thu, Mar 24, 2016 at 8:27 AM, Reynold Xin <r...@databricks.com> wrote:
> > About a year ago we decided to drop Java 6 support in Spark 1.5. I am
> > wondering if we should also just drop Java 7 support in Spark 2.0 (i.e.
> > Spark 2.0 would require Java 8 to run).
> >
> > Oracle ended public updates for JDK 7 a year ago (Apr 2015), and
> > removed public downloads for JDK 7 in July 2015. In the past I've
> > actually been against dropping Java 7, but today I ran into an issue
> > with the new Dataset API not working well with Java 8 lambdas, and
> > that changed my opinion on this.
> >
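> > (For concreteness, here is a rough sketch of the kind of Java 8 lambda
> > usage with the typed Dataset API that I have in mind. Treat the exact
> > API shape, the class name, and the input path as illustrative
> > assumptions about where 2.0 lands, not a spec:
> >
> >   import org.apache.spark.api.java.function.MapFunction;
> >   import org.apache.spark.sql.Dataset;
> >   import org.apache.spark.sql.Encoders;
> >   import org.apache.spark.sql.SparkSession;
> >
> >   public class LambdaSketch {
> >     public static void main(String[] args) {
> >       SparkSession spark = SparkSession.builder()
> >           .appName("lambda-sketch").master("local[*]").getOrCreate();
> >       // Read lines as a typed Dataset<String>.
> >       Dataset<String> lines = spark.read().textFile("README.md");
> >       // A Java 8 lambda; the explicit MapFunction cast keeps the
> >       // overloaded map() resolution unambiguous from Java.
> >       Dataset<Integer> lengths = lines.map(
> >           (MapFunction<String, Integer>) s -> s.length(),
> >           Encoders.INT());
> >       lengths.show();
> >       spark.stop();
> >     }
> >   }
> >
> > Exercising exactly this kind of code on both JDKs is what drives the
> > testing concern in point 1 below.)
> >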
> > I've been thinking more about this issue today and also talked with a
> > lot of people offline to gather feedback, and I actually think the pros
> > outweigh the cons, for the following reasons (in some rough order of
> > importance):
> >
> > 1. It is complicated to test how well Spark APIs work for Java lambdas
> > if we support Java 7. Jenkins machines need to have both Java 7 and
> > Java 8 installed, and we must run the test suites with Java 7 and then
> > the lambda tests with Java 8. This complicates build
> > environments/scripts and makes them less robust. Without good testing
> > infrastructure, I have no confidence in building good APIs for Java 8.
> >
> > 2. Dataset/DataFrame performance will be anywhere from 1x to 10x slower
> > on Java 7. The primary APIs we want users to use in Spark 2.x are
> > Dataset/DataFrame, and this impacts pretty much everything from machine
> > learning to structured streaming. We have made great progress in their
> > performance through extensive use of code generation. (In many
> > dimensions Spark 2.0 with DataFrames/Datasets looks more like a
> > compiler than a MapReduce or query engine.) These optimizations don't
> > work well on Java 7 due to broken code cache flushing; Oracle fixed
> > this in Java 8. In addition, Java 8 comes with better support for
> > Unsafe and SIMD.
> >
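> > (Again purely illustrative and assuming 2.0-era APIs, not a claim about
> > the final interface: one way to see the code-generation path at work is
> > to print the extended plan of a trivial aggregation, where the
> > whole-stage-generated operators are marked. A minimal sketch:
> >
> >   import org.apache.spark.sql.Dataset;
> >   import org.apache.spark.sql.Row;
> >   import org.apache.spark.sql.SparkSession;
> >
> >   public class CodegenPeek {
> >     public static void main(String[] args) {
> >       SparkSession spark = SparkSession.builder()
> >           .appName("codegen-peek").master("local[*]").getOrCreate();
> >       // A trivial aggregation that goes through the
> >       // DataFrame/Dataset code-generation path.
> >       Dataset<Row> counts = spark.range(1000000).toDF("id")
> >           .selectExpr("id % 10 as bucket")
> >           .groupBy("bucket").count();
> >       // Print the extended physical plan for the query above.
> >       counts.explain(true);
> >       spark.stop();
> >     }
> >   }
> >
> > All of that generated code ultimately has to live in the JIT code
> > cache, which is where the Java 7 flushing bug bites.)
> >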
> > 3. Scala 2.12 will come out soon, and we will want to add support for
> > that. Scala 2.12 only works on Java 8. If we do support Java 7, we'd
> > have a fairly complicated compatibility matrix and testing
> > infrastructure.
> >
> > 4. There are libraries that I've looked into in the past that support
> > only Java 8. This is more common in high-performance libraries such as
> > Aeron (a messaging library). Having to support Java 7 means we are not
> > able to use these. It is not that big of a deal right now, but will
> > become increasingly difficult as we optimize performance.
> >
> >
> > The downside of not supporting Java 7 is also obvious. Some organizations
> > are stuck with Java 7, and they wouldn't be able to use Spark 2.0 without
> > upgrading Java.
> >
> >
>
