+1 for java 8 only
+1 for 2.11+ only

At this point, scala libraries supporting only 2.10 are typically less active and/or poorly maintained. That trend will only continue over the lifespan of spark 2.x.
2016-03-24 11:32 GMT-07:00 Steve Loughran <ste...@hortonworks.com>:

> On 24 Mar 2016, at 15:27, Koert Kuipers <ko...@tresata.com> wrote:
>
> i think the arguments are convincing, but it also makes me wonder if i
> live in some kind of alternate universe... we deploy on customers'
> clusters, where the OS, python version, java version and hadoop distro
> are not chosen by us. so think centos 6, cdh5 or hdp 2.3, java 7 and
> python 2.6. we simply have access to a single proxy machine and launch
> through yarn. asking them to upgrade java is pretty much out of the
> question or a 6+ month ordeal. of the 10 client clusters i can think of
> off the top of my head, all of them are on java 7, none are on java 8.
> so by doing this you would make spark 2 basically unusable for us
> (unless most of them have plans of upgrading in the near term to java 8;
> i will ask around and report back...).
>
> It's not actually mandatory for the process executing in the YARN cluster
> to run with the same JVM as the rest of the Hadoop stack; all that is
> needed is for the environment variables to set up JAVA_HOME and PATH.
> Switching JVMs is not something which YARN makes easy to do, but it may be
> possible, especially if Spark itself provides some hooks, so you don't
> have to manually play with setting things up. That may be something which
> could significantly ease adoption of Spark 2 in YARN clusters. Same for
> Python.
>
> This is something I could probably help others to address.
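For reference, Spark already exposes per-container environment settings on YARN that can be used for this: spark.yarn.appMasterEnv.* for the application master and spark.executorEnv.* for the executors. A minimal sketch in Scala, assuming a Java 8 install exists at the same path on every node (the JDK path below is only an example and would vary per cluster):

    import org.apache.spark.SparkConf

    // Point the YARN application master and executor containers at a
    // different JVM than the cluster default. The path is hypothetical
    // and cluster-specific.
    val conf = new SparkConf()
      .set("spark.yarn.appMasterEnv.JAVA_HOME", "/usr/java/jdk1.8.0_77")
      .set("spark.executorEnv.JAVA_HOME", "/usr/java/jdk1.8.0_77")

The same settings can be passed as --conf flags to spark-submit, so no code change is strictly required; the open question is whether that is smooth enough in practice, or whether Spark should offer a first-class hook as Steve suggests.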