[ https://issues.apache.org/jira/browse/HIVE-8300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152801#comment-14152801 ]
Brock Noland commented on HIVE-8300:
------------------------------------

I guess we need to revert this commented-out line here:

https://github.com/apache/hive/blob/spark/ql/pom.xml#L601

as it was only there because Spark had its own version of guava. See trunk here:

https://github.com/apache/hive/blob/trunk/ql/pom.xml#L632

> Missing guava lib causes IllegalStateException when deserializing a task [Spark Branch]
> ---------------------------------------------------------------------------------------
>
>                 Key: HIVE-8300
>                 URL: https://issues.apache.org/jira/browse/HIVE-8300
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>         Environment: Spark-1.2.0-SNAPSHOT
>            Reporter: Rui Li
>
> In spark-1.2, we have guava shaded in spark-assembly. And we only ship hive-exec to spark cluster. So spark executor won't have (original) guava in its class path.
> This can cause some problem when TaskRunner deserializes a task, and throws something like this:
> {code}
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, node13-1): java.lang.IllegalStateException: unread block data
>         java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2421)
>         java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1382)
>         java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>         java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>         java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>         java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>         java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>         org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
>         org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
>         org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:164)
>         java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         java.lang.Thread.run(Thread.java:744)
> {code}
> We may have to verify this issue and ship guava to spark cluster.
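
One way to verify whether the original (unshaded) guava is visible on a given class path might be a small probe like the one below. This is only a sketch and not part of Hive or Spark: the class name ClasspathProbe and the method locate are hypothetical, and com.google.common.base.Preconditions is used simply as a representative class from unshaded guava.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper (not part of Hive or Spark).
final class ClasspathProbe {

    // Returns the location the named class is loaded from,
    // "<bootstrap>" for JDK classes without a code source,
    // or null if the class cannot be loaded at all.
    static String locate(String className) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        try {
            // 'false' avoids running static initializers; we only want to resolve the class.
            Class<?> c = Class.forName(className, false, loader);
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null ? "<bootstrap>" : src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Assumption: Preconditions stands in for "original guava is present".
        String name = args.length > 0 ? args[0] : "com.google.common.base.Preconditions";
        String where = locate(name);
        System.out.println(name + " -> " + (where == null ? "NOT FOUND" : where));
    }
}
```

Running it with the same class path as the executor JVM should show either the jar that provides the class or NOT FOUND, which would be consistent with the missing-guava explanation above.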