[ https://issues.apache.org/jira/browse/HIVE-7431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14066120#comment-14066120 ]
Rui Li commented on HIVE-7431:
------------------------------

I also noted that when running on Tez, MapWork is cached in an ObjectCache. Now spark mode retrieves the MapWork from the plan file for each task. Not sure if this could be related to the issue here?

> When run on spark cluster, some spark tasks may fail
> ----------------------------------------------------
>
>                 Key: HIVE-7431
>                 URL: https://issues.apache.org/jira/browse/HIVE-7431
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Rui Li
>
> When running queries on spark, some spark tasks fail (usually the first
> couple of tasks) with the following stack trace:
> {quote}
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:154)
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:60)
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:35)
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:161)
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:161)
> org.apache.spark.rdd.RDD$$anonfun$12.apply(RDD.scala:559)
> org.apache.spark.rdd.RDD$$anonfun$12.apply(RDD.scala:559)
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
> org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
> org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:158)
> ...
> {quote}
> Observed for spark standalone cluster. Not verified for spark on yarn or
> mesos.
> NO PRECOMMIT TESTS. This is for spark branch only.

--
This message was sent by Atlassian JIRA
(v6.2#6252)