Hi, I'm running a very simple job (textFile -> map -> groupBy -> count) with Spark 1.6.0 on EMR 4.3 (Hadoop 2.7.1). It works in local mode, but in yarn-client mode it fails with the exception below.
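For reference, the job is roughly the following sketch (the input path and the parse step are simplified placeholders, not the exact code):

import org.apache.spark.{SparkConf, SparkContext}

object GroupByCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("GroupByCount")
    val sc = new SparkContext(conf)

    val numGroups = sc.textFile("s3://my-bucket/input/")  // read the input (placeholder path)
      .map(line => line.split(",")(0))                    // extract a key per line (placeholder parse)
      .groupBy(identity)                                  // group identical keys
      .count()                                            // count the groups

    println(numGroups)
    sc.stop()
  }
}

The exception: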
16/05/11 10:29:26 INFO TaskSetManager: Starting task 0.1 in stage 0.0 (TID 1, ip-172-31-33-97.ec2.internal, partition 0,NODE_LOCAL, 15116 bytes)
16/05/11 10:29:26 WARN TaskSetManager: Lost task 0.1 in stage 0.0 (TID 1, ip-172-31-33-97.ec2.internal): java.lang.ClassCastException: org.apache.spark.util.SerializableConfiguration cannot be cast to [B
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

I found a JIRA that relates to streaming and accumulators, but I'm using neither. Any ideas? Should I file a JIRA?

Thank you,
Daniel