Hi, I have an issue running SparkR through Oozie. Could you please help me? I am trying to run a SparkR job through Oozie in yarn-client mode, and I have installed the R package on all my nodes.
My job.properties looks like this:

nameNode=hdfs://XXX:8020
jobTracker=XXX:8050
master=yarn-client
queueName=default
oozie.use.system.libpath=true
oozie.wf.application.path=/user/oozie/measurecountWF

The workflow looks like this:

<workflow-app xmlns='uri:oozie:workflow:0.5' name='measurecountWF'>
    <global>
        <configuration>
            <property>
                <name>oozie.launcher.yarn.app.mapreduce.am.env</name>
                <value>SPARK_HOME=XXXX</value>
            </property>
        </configuration>
    </global>
    <start to="sparkAction"/>
    <action name="sparkAction">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>${master}</master>
            <name>measurecountWF</name>
            <jar>measurecount.R</jar>
            <spark-opts>--conf spark.driver.extraJavaOptions=XXXX</spark-opts>
        </spark>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Workflow failed, error message[${wf:errorMessage(wf:lastErrorNode())}] </message>
    </kill>
    <end name="end"/>
</workflow-app>

It failed with a class-not-found exception:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, XXXX): java.lang.ClassNotFoundException: com.cloudant.spark.common.JsonStoreRDDPartition
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:68)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInpu
Calls: sql -> callJMethod -> invokeJava
Execution halted
Intercepting System.exit(1)

Does Oozie support running SparkR in a Spark action, or can we only run it by wrapping it in an SSH action?

Thanks a lot.