[ https://issues.apache.org/jira/browse/PIG-4173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157756#comment-14157756 ]
Ángel Álvarez commented on PIG-4173:
------------------------------------
I'm getting this error whenever I try to load any file from HDFS:
2014-10-02 17:44:19,592 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR
0: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in
stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID
3, tldam4602.lda): java.lang.IllegalStateException: unread block data
java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2421)
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1382)
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
It only fails when I run my two-line script (LOAD+DUMP) on the cluster.
I think ... it might have something to do with my client libraries or their order
...
> Move to Spark 1.x
> -----------------
>
> Key: PIG-4173
> URL: https://issues.apache.org/jira/browse/PIG-4173
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Reporter: bc Wong
> Assignee: Richard Ding
> Attachments: PIG-4173.patch, PIG-4173_2.patch, PIG-4173_3.patch,
> TEST-org.apache.pig.spark.TestSpark.txt
>
>
> The Spark branch is using Spark 0.9:
> https://github.com/apache/pig/blob/spark/ivy.xml#L438. We should probably
> switch to Spark 1.x asap, due to Spark interface changes since 1.0.