[ https://issues.apache.org/jira/browse/HIVE-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658874#comment-14658874 ]
Daniel Dai commented on HIVE-11456:
-----------------------------------

Though a test case is missing, it might be costly to add, since it involves upgrading the Pig dependency to 0.15 and running Pig in Tez mode. The changes look simple and don't cause a regression. So +1.

> HCatStorer should honor mapreduce.output.basename
> -------------------------------------------------
>
>                 Key: HIVE-11456
>                 URL: https://issues.apache.org/jira/browse/HIVE-11456
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.2.0
>            Reporter: Rohini Palaniswamy
>            Assignee: Mithun Radhakrishnan
>            Priority: Critical
>         Attachments: HIVE-11456.1.patch
>
>
> Pig on Tez scripts with union directly followed by HCatStorer have a problem
> due to HCatStorer not honoring mapreduce.output.basename and always using
> part. Tez sets mapreduce.output.basename to part-v000-o000 (vertex id
> followed by output id). With union optimizer, Pig uses vertex groups to write
> directly from both the vertices to the final output directory. Since hcat
> ignores the mapreduce.output.basename, both the vertices produce
> part-r-0000<n> and when they are moved from the temp location to the final
> directory, they just overwrite each other. There is no failure and only one
> of the files with that name makes it into the final directory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
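
The silent overwrite described in the issue can be illustrated with a small standalone sketch. This is not HCatalog or Tez code; it only mimics the naming and move steps with local files. The helper names (write_task_output, commit, simulate), the second vertex's basename part-v001-o000, and the "<basename>-r-<5-digit id>" file name layout are assumptions made for illustration.

```python
import os
import shutil
import tempfile


def write_task_output(temp_dir, basename, task_id, payload):
    """Write one output file named <basename>-r-<5-digit task id> (assumed layout)."""
    os.makedirs(temp_dir, exist_ok=True)
    path = os.path.join(temp_dir, f"{basename}-r-{task_id:05d}")
    with open(path, "w") as handle:
        handle.write(payload)
    return path


def commit(temp_dir, final_dir):
    """Move every file from temp_dir into final_dir, replacing same-named files."""
    os.makedirs(final_dir, exist_ok=True)
    for name in sorted(os.listdir(temp_dir)):
        # os.replace overwrites an existing destination without raising an error.
        os.replace(os.path.join(temp_dir, name), os.path.join(final_dir, name))


def simulate(honor_basename):
    """Return the file names left in the final directory after both vertices commit."""
    root = tempfile.mkdtemp()
    final_dir = os.path.join(root, "final")
    # Hypothetical per-vertex basenames; only the first one appears in the issue text.
    vertices = {
        "part-v000-o000": "rows from vertex 0",
        "part-v001-o000": "rows from vertex 1",
    }
    for index, (basename, payload) in enumerate(vertices.items()):
        # Ignoring the configured basename reproduces the reported behaviour.
        name = basename if honor_basename else "part"
        temp_dir = os.path.join(root, f"tmp{index}")
        write_task_output(temp_dir, name, 0, payload)
        commit(temp_dir, final_dir)
    result = sorted(os.listdir(final_dir))
    shutil.rmtree(root)
    return result


if __name__ == "__main__":
    print("basename ignored:", simulate(honor_basename=False))
    print("basename honored:", simulate(honor_basename=True))
```

With the basename ignored, both vertices write part-r-00000 and only one file survives the move. With it honored, each vertex keeps a distinct file name and both outputs reach the final directory.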