Spark 2.x has to be the time to move to Java 8.

I'd rather bump the required JVM major version on a Spark major release
than on a minor release, and I'd rather Spark make that jump for the 2.x
series than for the 3.x series (~2 years from now, judging by the lifetime
of Spark 1.x). If we wait for the next opportunity for a breaking change
(3.x), the upgrade at that point may well be to Java 9 rather than Java 8.

If Spark users need Java 7, they are free to continue using the 1.x series,
the same way that folks who need Java 6 are free to continue using 1.4.
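
For anyone in the situation Koert describes below: Steve's suggestion of
pointing YARN containers at a different JVM can be expressed with existing
Spark configuration, since Spark on YARN supports the
spark.yarn.appMasterEnv.[Name] and spark.executorEnv.[Name] properties. A
rough, untested sketch (the JDK path is hypothetical, and a Java 8 install
would need to already exist at that path on every node):

    # JAVA_HOME value is an example; point it at your cluster's Java 8 install
    spark-submit \
      --master yarn \
      --conf spark.yarn.appMasterEnv.JAVA_HOME=/usr/java/jdk1.8.0 \
      --conf spark.executorEnv.JAVA_HOME=/usr/java/jdk1.8.0 \
      ...

This doesn't remove the need to install the JDK on the nodes, but it does
decouple the JVM Spark runs on from the one the rest of the Hadoop stack uses.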

On Thu, Mar 24, 2016 at 11:46 AM, Stephen Boesch <java...@gmail.com> wrote:

> +1 for Java 8 only, and +1 for 2.11+ only. At this point Scala libraries
> that support only 2.10 are typically less active and/or poorly maintained,
> and that trend will only continue over the lifespan of Spark 2.x.
>
> 2016-03-24 11:32 GMT-07:00 Steve Loughran <ste...@hortonworks.com>:
>
>>
>> On 24 Mar 2016, at 15:27, Koert Kuipers <ko...@tresata.com> wrote:
>>
>> I think the arguments are convincing, but they also make me wonder if I
>> live in some kind of alternate universe... We deploy on customers' clusters,
>> where the OS, Python version, Java version and Hadoop distro are not chosen
>> by us. Think CentOS 6, CDH 5 or HDP 2.3, Java 7 and Python 2.6. We simply
>> have access to a single proxy machine and launch through YARN. Asking them
>> to upgrade Java is pretty much out of the question, or a 6+ month ordeal. Of
>> the 10 client clusters I can think of off the top of my head, all of them
>> are on Java 7 and none are on Java 8. So by doing this you would make Spark
>> 2 basically unusable for us (unless most of them have plans to upgrade to
>> Java 8 in the near term; I will ask around and report back...).
>>
>>
>>
>> It's not actually mandatory for the processes executing in the YARN cluster
>> to run on the same JVM as the rest of the Hadoop stack; all that is needed
>> is for the JAVA_HOME and PATH environment variables to be set up. Switching
>> JVMs is not something YARN makes easy, but it may be possible, especially
>> if Spark itself provides some hooks so you don't have to manually fiddle
>> with setting things up. That could significantly ease adoption of Spark 2
>> on YARN clusters. The same goes for Python.
>>
>> This is something I could probably help others to address
>>
>>
>
