Hi Todd,

Yes, those entries were present in the conf directory under the same SPARK_HOME that was used to run spark-submit.

On a related note, I'm assuming that the additional Spark-on-YARN options (like spark.yarn.jar) need to be set in the same properties file that is passed to spark-submit. Beyond that, I assume that no other host on the cluster should require a deployment of the Spark distribution or any other config change to support a Spark job. Is that correct?
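For reference, here's roughly what I have in mind (the jar path, app class, and file names below are placeholders for illustration, not from my actual setup):

    # /path/to/spark-custom.conf -- passed to spark-submit via --properties-file
    spark.yarn.jar                   hdfs:///apps/spark/spark-assembly-1.2.1-hadoop2.6.0.jar
    spark.driver.extraJavaOptions    -Dhdp.version=2.2.0.0-2041
    spark.yarn.am.extraJavaOptions   -Dhdp.version=2.2.0.0-2041

    # Launch from the single client host that has the distribution:
    $SPARK_HOME/bin/spark-submit \
      --master yarn-client \
      --properties-file /path/to/spark-custom.conf \
      --class com.example.MyApp \
      /path/to/my-app.jar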
On Tue, Mar 17, 2015 at 6:19 PM, Todd Nist <tsind...@gmail.com> wrote:

> Hi Bharath,
>
> Do you have these entries in your $SPARK_HOME/conf/spark-defaults.conf
> file?
>
> spark.driver.extraJavaOptions -Dhdp.version=2.2.0.0-2041
> spark.yarn.am.extraJavaOptions -Dhdp.version=2.2.0.0-2041
>
> On Tue, Mar 17, 2015 at 1:04 AM, Bharath Ravi Kumar <reachb...@gmail.com>
> wrote:
>
>> Still no luck running purpose-built 1.3 against HDP 2.2 after following
>> all the instructions. Has anyone else faced this issue?
>>
>> On Mon, Mar 16, 2015 at 8:53 PM, Bharath Ravi Kumar <reachb...@gmail.com>
>> wrote:
>>
>>> Hi Todd,
>>>
>>> Thanks for the help. I'll try again after building a distribution with
>>> the 1.3 sources. However, I wanted to confirm what I mentioned earlier: is
>>> it sufficient to copy the distribution only to the client host from which
>>> spark-submit is invoked (with spark.yarn.jar set), or does the entire
>>> distribution need to be pre-deployed on every host in the YARN cluster?
>>> I'd assume the latter shouldn't be necessary.
>>>
>>> On Mon, Mar 16, 2015 at 8:38 PM, Todd Nist <tsind...@gmail.com> wrote:
>>>
>>>> Hi Bharath,
>>>>
>>>> I ran into the same issue a few days ago; here is a link to a post on
>>>> Hortonworks' forum:
>>>> http://hortonworks.com/community/forums/search/spark+1.2.1/
>>>>
>>>> In case anyone else needs to do this, these are the steps I took to
>>>> get it to work with Spark 1.2.1 as well as Spark 1.3.0-RC3:
>>>>
>>>> 1. Pull the 1.2.1 source.
>>>> 2. Apply the following patches:
>>>>    a. Address the Jackson version: https://github.com/apache/spark/pull/3938
>>>>    b. Address the propagation of hdp.version set in
>>>>       spark-defaults.conf: https://github.com/apache/spark/pull/3409
>>>> 3. Build with: $SPARK_HOME/make-distribution.sh --name hadoop2.6 --tgz
>>>>    -Pyarn -Phadoop-2.4 -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver
>>>>    -DskipTests package
>>>>
>>>> Then deploy the resulting artifact => spark-1.2.1-bin-hadoop2.6.tgz
>>>> following the instructions in the HDP Spark preview:
>>>> http://hortonworks.com/hadoop-tutorial/using-apache-spark-hdp/
>>>>
>>>> FWIW, spark-1.3.0 appears to be working fine with HDP as well, and
>>>> steps 2a and 2b are not required.
>>>>
>>>> HTH
>>>>
>>>> -Todd
>>>>
>>>> On Mon, Mar 16, 2015 at 10:13 AM, Bharath Ravi Kumar <
>>>> reachb...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Trying to run Spark (1.2.1 built for HDP 2.2) against a YARN cluster
>>>>> results in the AM failing to start with the following error on stderr:
>>>>> Error: Could not find or load main class
>>>>> org.apache.spark.deploy.yarn.ExecutorLauncher
>>>>> An application id was assigned to the job, but there were no logs. Note
>>>>> that the Spark distribution has not been "installed" on every host in the
>>>>> cluster; the aforementioned Spark build was copied to one of the Hadoop
>>>>> client hosts in the cluster to launch the job. spark-submit was run with
>>>>> --master yarn-client, and spark.yarn.jar was set to the assembly jar from
>>>>> the above distribution. Switching the Spark distribution to the HDP
>>>>> recommended version and following the instructions on this page
>>>>> <http://hortonworks.com/hadoop-tutorial/using-apache-spark-hdp/> did not
>>>>> fix the problem either. Any idea what may have caused this error?
>>>>>
>>>>> Thanks,
>>>>> Bharath
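One quick sanity check that might help isolate the ExecutorLauncher error (the assembly jar name below is from my build; adjust it to match yours):

    # Verify the assembly jar referenced by spark.yarn.jar actually
    # contains the class that YARN is failing to load:
    jar tf spark-assembly-1.2.1-hadoop2.6.0.jar | grep ExecutorLauncher
    # expected output: org/apache/spark/deploy/yarn/ExecutorLauncher.class

If the class is present, the problem is more likely in how the jar is localized to the AM container than in the build itself.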