Have you tried the following?

--conf spark.driver.userClassPathFirst=true --conf spark.executor.userClassPathFirst=true
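
In case it helps, here is a rough sketch of where those two settings might go in the spark-submit invocation quoted below. It only prints the command rather than running it. The default values, the SPARK_HOME location, and the build_submit_cmd helper are placeholders of mine, not anything from your setup or from Spark itself, and the remaining options and application arguments are left out for brevity.

```shell
#!/usr/bin/env bash
# Sketch only: prints the spark-submit command instead of executing it.
# Variable names come from the quoted command; the defaults are placeholders
# so that the script can run on its own.
ACME_INGEST_HOME="${ACME_INGEST_HOME:-/mnt/acme/acme-ingest}"
ACME_INGEST_VERSION="${ACME_INGEST_VERSION:-0.0.1-SNAPSHOT}"
SPARK_MASTER_URL="${SPARK_MASTER_URL:-spark://data1:7077}"
SPARK_HOME="${SPARK_HOME:-/opt/spark}"   # assumed install location

# build_submit_cmd is a hypothetical helper, not part of Spark.
# It emits the command on one line with the two userClassPathFirst settings added.
build_submit_cmd() {
  printf '%s ' \
    "$SPARK_HOME/bin/spark-submit" \
    --conf spark.driver.userClassPathFirst=true \
    --conf spark.executor.userClassPathFirst=true \
    --class com.acme.consumer.kafka.spark.KafkaSparkStreamingDriver \
    --master "$SPARK_MASTER_URL" \
    "$ACME_INGEST_HOME/lib/acme-ingest-kafka-spark-$ACME_INGEST_VERSION.jar"
  printf '\n'
}

build_submit_cmd
```

Both settings are marked experimental in the Spark configuration docs, so treat this as something to try rather than a confirmed fix.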

On Tue, Sep 29, 2015 at 4:38 PM, Dmitry Goldenberg <[email protected]> wrote:

> Release of Spark: 1.5.0.
>
> Command line invocation:
>
> ACME_INGEST_HOME=/mnt/acme/acme-ingest
> ACME_INGEST_VERSION=0.0.1-SNAPSHOT
> ACME_BATCH_DURATION_MILLIS=5000
> SPARK_MASTER_URL=spark://data1:7077
> JAVA_OPTIONS="-Dspark.streaming.kafka.maxRatePerPartition=1000"
> JAVA_OPTIONS="$JAVA_OPTIONS -Dspark.executor.memory=2g"
>
> $SPARK_HOME/bin/spark-submit \
>     --driver-class-path $ACME_INGEST_HOME \
>     --driver-java-options "$JAVA_OPTIONS" \
>     --class "com.acme.consumer.kafka.spark.KafkaSparkStreamingDriver" \
>     --master $SPARK_MASTER_URL \
>     --conf "spark.executor.extraClassPath=$ACME_INGEST_HOME/conf:$ACME_INGEST_HOME/lib/hbase-protocol-0.98.9-hadoop2.jar" \
>     $ACME_INGEST_HOME/lib/acme-ingest-kafka-spark-$ACME_INGEST_VERSION.jar \
>     -brokerlist $METADATA_BROKER_LIST \
>     -topic acme.topic1 \
>     -autooffsetreset largest \
>     -batchdurationmillis $ACME_BATCH_DURATION_MILLIS \
>     -appname Acme.App1 \
>     -checkpointdir file://$SPARK_HOME/acme/checkpoint-acme-app1
>
> Note that SolrException is definitely in our consumer jar
> acme-ingest-kafka-spark-$ACME_INGEST_VERSION.jar which gets deployed to
> $ACME_INGEST_HOME.
>
> For the extraClassPath on the executors, we've got additionally
> hbase-protocol-0.98.9-hadoop2.jar: we're using Apache Phoenix from the
> Spark jobs to communicate with HBase. The only way to force Phoenix to
> successfully communicate with HBase was to have that JAR explicitly added
> to the executor classpath regardless of the fact that the contents of the
> hbase-protocol hadoop jar get rolled up into the consumer jar at build time.
>
> I'm starting to wonder whether there's some class loading pattern here
> where some classes may not get loaded out of the consumer jar and therefore
> have to have their respective jars added to the executor extraClassPath?
>
> Or is this a serialization problem for SolrException as Divya
> Ravichandran suggested?
>
> On Tue, Sep 29, 2015 at 6:16 PM, Ted Yu <[email protected]> wrote:
>
>> Mind providing a bit more information:
>>
>> release of Spark
>> command line for running Spark job
>>
>> Cheers
>>
>> On Tue, Sep 29, 2015 at 1:37 PM, Dmitry Goldenberg <[email protected]> wrote:
>>
>>> We're seeing this occasionally. Granted, this was caused by a wrinkle in
>>> the Solr schema but this bubbled up all the way in Spark and caused job
>>> failures.
>>>
>>> I just checked and SolrException class is actually in the consumer job
>>> jar we use. Is there any reason why Spark cannot find the SolrException
>>> class?
>>>
>>> 15/09/29 15:41:58 WARN ThrowableSerializationWrapper: Task exception could not be deserialized
>>> java.lang.ClassNotFoundException: org.apache.solr.common.SolrException
>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>     at java.lang.Class.forName0(Native Method)
>>>     at java.lang.Class.forName(Class.java:348)
>>>     at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
>>>     at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
>>>     at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>>>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
>>>     at org.apache.spark.ThrowableSerializationWrapper.readObject(TaskEndReason.scala:163)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>     at java.lang.reflect.Method.invoke(Method.java:497)
>>>     at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1900)
>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>>>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
>>>     at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
>>>     at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
>>>     at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply$mcV$sp(TaskResultGetter.scala:108)
>>>     at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
>>>     at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
>>>     at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
>>>     at org.apache.spark.scheduler.TaskResultGetter$$anon$3.run(TaskResultGetter.scala:105)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>     at java.lang.Thread.run(Thread.java:745)
