Hi,
I have an issue running a SparkR job through Oozie. Could you please help me?
I am trying to run the SparkR job through Oozie in yarn-client mode, and I have
installed the R package on all my nodes.

My job.properties looks like this:
nameNode=hdfs://XXX:8020
jobTracker=XXX:8050
master=yarn-client
queueName=default
oozie.use.system.libpath=true
oozie.wf.application.path=/user/oozie/measurecountWF

The workflow looks like this:
<workflow-app xmlns='uri:oozie:workflow:0.5' name='measurecountWF'>
    <global>
        <configuration>
            <property>
                <name>oozie.launcher.yarn.app.mapreduce.am.env</name>
                <value>SPARK_HOME=XXXX</value>
            </property>
        </configuration>
    </global>
    <start to="sparkAction"/>
    <action name="sparkAction">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>${master}</master>
            <name>measurecountWF</name>
            <jar>measurecount.R</jar>
            <spark-opts>--conf spark.driver.extraJavaOptions=XXXX</spark-opts>
        </spark>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Workflow failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>

The job failed with a ClassNotFoundException:

org.apache.spark.SparkException: Job aborted due to stage failure:
Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3
in stage 0.0 (TID 3, XXXX): java.lang.ClassNotFoundException:
com.cloudant.spark.common.JsonStoreRDDPartition
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:348)
        at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:68)
        at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
        at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInpu
Calls: sql -> callJMethod -> invokeJava
Execution halted
Intercepting System.exit(1)

Does Oozie support running SparkR in a spark action, or should we wrap
it in an ssh action instead?

Thanks a lot
