oh, i forgot: in step 1 you will also have to modify spark's pom.xml to include the cloudera repo, so the build can find the cloudera artifacts.
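the entry goes in the <repositories> section of the top-level pom.xml; something like this should do (the id and name are arbitrary placeholders, the url is cloudera's public maven repo):

    <repository>
      <id>cloudera</id>
      <name>cloudera repository</name>
      <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    </repository>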
anyhow, we found this process to be pretty easy, and we stopped using the spark versions bundled with the distros. (a consolidated sketch of steps 2-5 is at the bottom of this mail.)

On Mon, Sep 26, 2016 at 3:57 PM, Koert Kuipers <ko...@tresata.com> wrote:

> it is also easy to launch many different spark versions on yarn, simply by
> having them installed side-by-side:
>
> 1) build spark for your cdh version. for example, for cdh 5 i do:
> $ git checkout v2.0.0
> $ dev/make-distribution.sh --name cdh5.4-hive --tgz -Phadoop-2.6
> -Dhadoop.version=2.6.0-cdh5.4.4 -Pyarn -Phive -Psparkr
>
> 2) scp the spark tar.gz to the server you want to launch from, and untar it there
>
> 3) modify the new spark's conf/spark-env.sh so it has this:
> export HADOOP_CONF_DIR=/etc/hadoop/conf
>
> 4) modify the new spark's conf/spark-defaults.conf so it has this:
> spark.master yarn
>
> 5) now launch your application with the bin/spark-submit script from the
> new spark distro
>
> On Mon, Sep 26, 2016 at 11:48 AM, Rex X <dnsr...@gmail.com> wrote:
>
>> Yes, I have a Cloudera cluster with YARN. Any more details on how to make
>> this work with an uber jar?
>>
>> Thank you.
>>
>> On Sun, Sep 18, 2016 at 2:13 PM, Felix Cheung <felixcheun...@hotmail.com> wrote:
>>
>>> Well, an uber jar works on YARN, but not with standalone ;)
>>>
>>> On Sun, Sep 18, 2016 at 12:44 PM -0700, "Chris Fregly" <ch...@fregly.com> wrote:
>>>
>>> you'll see errors like this when mixing versions of spark:
>>>
>>> "java.lang.RuntimeException: java.io.InvalidClassException:
>>> org.apache.spark.rpc.netty.RequestMessage; local class incompatible:
>>> stream classdesc serialVersionUID = -2221986757032131007, local class
>>> serialVersionUID = -5447855329526097695"
>>>
>>> i'm actually seeing this right now while testing across Spark 1.6.1 and
>>> Spark 2.0.1 for my all-in-one, hybrid cloud/on-premise Spark + Zeppelin +
>>> Kafka + Kubernetes + Docker + One-Click Spark ML Model Production
>>> Deployments initiative, documented here:
>>>
>>> https://github.com/fluxcapacitor/pipeline/wiki/Kubernetes-Docker-Spark-ML
>>>
>>> and check out my upcoming meetup on this effort, either in person or
>>> online:
>>>
>>> http://www.meetup.com/Advanced-Spark-and-TensorFlow-Meetup/events/233978839/
>>>
>>> we're throwing in some GPU/CUDA just to sweeten the offering! :)
>>>
>>> On Sat, Sep 10, 2016 at 2:57 PM, Holden Karau <hol...@pigscanfly.ca> wrote:
>>>
>>>> I don't think a 2.0 uber jar will play nicely on a 1.5 standalone
>>>> cluster.
>>>>
>>>> On Saturday, September 10, 2016, Felix Cheung <felixcheun...@hotmail.com> wrote:
>>>>
>>>>> You should be able to get it to work with 2.0 as an uber jar.
>>>>>
>>>>> What type of cluster are you running on? YARN? And which distribution?
>>>>>
>>>>> On Sun, Sep 4, 2016 at 8:48 PM -0700, "Holden Karau" <hol...@pigscanfly.ca> wrote:
>>>>>
>>>>> You really shouldn't mix different versions of Spark between the
>>>>> master and worker nodes; if you're going to upgrade, upgrade all of
>>>>> them. Otherwise you may get very confusing failures.
>>>>>
>>>>> On Monday, September 5, 2016, Rex X <dnsr...@gmail.com> wrote:
>>>>>
>>>>>> I wish to use the pivot table feature of the DataFrame API, which has
>>>>>> been available since Spark 1.6, but the Spark on our current cluster
>>>>>> is version 1.5. Can we install Spark 2.0 on the master node to work
>>>>>> around this?
>>>>>>
>>>>>> Thanks!
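for anyone who lands on this thread later: the pivot feature rex is after is the groupBy(...).pivot(...) call added to the dataframe api in spark 1.6. a minimal sketch with the spark 2.0 api (the input path and column names are made up for illustration; on 1.6 you would go through sqlContext instead of SparkSession):

    import org.apache.spark.sql.SparkSession

    // build or reuse a session (spark-shell and spark-submit provide one already)
    val spark = SparkSession.builder().appName("pivot-example").getOrCreate()

    // hypothetical input dataset with year, quarter, and revenue columns
    val df = spark.read.parquet("/data/sales")

    df.groupBy("year")       // one output row per year
      .pivot("quarter")      // one output column per distinct quarter value
      .sum("revenue")        // aggregate revenue for each (year, quarter) cell
      .show()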
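and to tie steps 2-5 together, this is roughly the whole sequence on the launch box (the host name, paths, and application jar are made up; the tarball name is what make-distribution.sh produces for the build above):

    # copy the freshly built distro to the machine you launch from
    $ scp spark-2.0.0-bin-cdh5.4-hive.tgz mygateway:/opt/

    # untar it there, side-by-side with any other spark versions
    $ ssh mygateway
    $ cd /opt && tar -xzf spark-2.0.0-bin-cdh5.4-hive.tgz && cd spark-2.0.0-bin-cdh5.4-hive

    # point it at the cluster's hadoop config and default to yarn
    $ echo 'export HADOOP_CONF_DIR=/etc/hadoop/conf' >> conf/spark-env.sh
    $ echo 'spark.master yarn' >> conf/spark-defaults.conf

    # launch with this distro's own spark-submit
    $ bin/spark-submit --class com.example.MyApp /opt/jobs/myapp.jar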