[ https://issues.apache.org/jira/browse/HIVE-13278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15744366#comment-15744366 ]
Rui Li commented on HIVE-13278:
-------------------------------

Hi [~xuefuz], the conclusion is that we somehow try to read reduce.xml for a map-only job, and yes, it happens with MR as well. The call path is {{HiveOutputFormatImpl.checkOutputSpecs -> Utilities.getMapRedWork}}. HiveOutputFormatImpl needs the MapRedWork because it has to run some checks on all the FS operators. Since an FS only exists at the end of a job, my suggestion is to first try to get the MapWork. If the MapWork has an FS in it, this is a map-only job and we don't have to look for the ReduceWork. But [~stakiar] found that some map-only jobs may not have an FS in the MapWork, e.g. {{ANALYZE TABLE}}. For a complete fix, we'll need a flag in the JobConf indicating whether the job is map-only; or we can use my solution, which solves the issue for most cases.

Some special handling may be needed for HoS: there, map.xml and reduce.xml reside in different paths, so we can use {{mapred.task.is.map}} to determine whether the JobConf is for the MapWork or the ReduceWork, and then call getMapWork or getReduceWork respectively. (A rough sketch of both ideas is appended after the quoted issue text below.)

> Many redundant 'File not found' messages appeared in container log during query execution with Hive on Spark
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-13278
>                 URL: https://issues.apache.org/jira/browse/HIVE-13278
>             Project: Hive
>          Issue Type: Bug
>        Environment: Hive on Spark engine
>                     Found based on:
>                     Apache Hive 2.0.0
>                     Apache Spark 1.6.0
>            Reporter: Xin Hao
>            Assignee: Sahil Takiar
>            Priority: Minor
>
> Many redundant 'File not found' messages appeared in container log during query execution with Hive on Spark.
> Certainly, it doesn't prevent the query from running successfully, so it is marked as Minor for now.
> Error message example:
> {noformat}
> 16/03/14 01:45:06 INFO exec.Utilities: File not found: File does not exist: /tmp/hive/hadoop/2d378538-f5d3-493c-9276-c62dd6634fb4/hive_2016-03-14_01-44-16_835_623058724409492515-6/-mr-10010/0a6d0cae-1eb3-448c-883b-590b3b198a73/reduce.xml
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
>         at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1932)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1873)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1853)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1825)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:565)
>         at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:87)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:363)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
> {noformat}
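To make the two suggestions in the comment concrete, here is a rough sketch, not the patch actually attached to this issue: the class name {{FileSinkLookupSketch}}, the helper {{addFileSinks}}, and the exact shapes of the {{Utilities}}/{{BaseWork}} calls are assumptions modeled on the Hive 2.x code base.

{code:java}
// Hypothetical sketch of the two ideas discussed above (not the HIVE-13278 patch).
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.hive.ql.exec.FileSinkOperator;
import org.apache.hadoop.hive.ql.exec.Operator;
import org.apache.hadoop.hive.ql.exec.Utilities;
import org.apache.hadoop.hive.ql.plan.BaseWork;
import org.apache.hadoop.hive.ql.plan.MapWork;

public class FileSinkLookupSketch {

  /**
   * Idea 1 (general case): deserialize map.xml first; if the MapWork already
   * contains an FS operator the job is map-only, so reduce.xml is never looked
   * up and the 'File not found' noise disappears for those jobs.
   */
  public static List<Operator<?>> collectFileSinksMapFirst(JobConf job) {
    List<Operator<?>> fileSinks = new ArrayList<>();
    MapWork mapWork = Utilities.getMapWork(job);
    boolean mapHasFileSink = addFileSinks(mapWork, fileSinks);
    if (!mapHasFileSink) {
      // Not provably map-only; fall back to reduce.xml as before. Note that some
      // map-only jobs (e.g. ANALYZE TABLE) have no FS in the MapWork, so this
      // path can still probe a non-existent reduce.xml -- hence only a partial fix.
      addFileSinks(Utilities.getReduceWork(job), fileSinks);
    }
    return fileSinks;
  }

  /**
   * Idea 2 (HoS): map.xml and reduce.xml live under different paths, so the
   * "mapred.task.is.map" flag in the JobConf tells us which plan to load.
   */
  public static List<Operator<?>> collectFileSinksBySide(JobConf job) {
    List<Operator<?>> fileSinks = new ArrayList<>();
    BaseWork work = job.getBoolean("mapred.task.is.map", true)
        ? Utilities.getMapWork(job)
        : Utilities.getReduceWork(job);
    addFileSinks(work, fileSinks);
    return fileSinks;
  }

  /** Collects FS operators from a work object; returns true if any were found. */
  private static boolean addFileSinks(BaseWork work, List<Operator<?>> fileSinks) {
    boolean found = false;
    if (work == null) {
      return false;
    }
    for (Operator<?> op : work.getAllOperators()) {
      if (op instanceof FileSinkOperator) {
        fileSinks.add(op);
        found = true;
      }
    }
    return found;
  }
}
{code}

The first method is the engine-agnostic map-first check; the second relies on {{mapred.task.is.map}}, which matters for HoS because its map.xml and reduce.xml are kept under separate paths.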