[ https://issues.apache.org/jira/browse/HIVE-25295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17421137#comment-17421137 ]
Zhihua Deng edited comment on HIVE-25295 at 9/28/21, 3:14 AM:
--------------------------------------------------------------

Have you tried HIVE-17963? There is a catch for runaway processes adding additional files into the staging directory.

was (Author: dengzh):
Have you tried HIVE-17963, there is a catch for runaway processes adding additional files into staging directory.

> "File already exist exception" during mapper/reducer retry with old hive(0.13)
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-25295
>                 URL: https://issues.apache.org/jira/browse/HIVE-25295
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 0.13.0
>            Reporter: yuquan wang
>            Priority: Blocker
>
> We are still running a very old Hive version (0.13) for historical reasons, and we
> often hit the following issue:
> {code:java}
> Caused by: java.io.IOException: File already exists:s3://smart-dmp/warehouse/uploaded/ad_dmp_pixel/dt=2021-06-21/key=259f3XXXXXXX
> {code}
> We have investigated this issue for quite a long time without finding a good
> fix, so I would like to ask the Hive community whether there are any
> solutions.
>
> The error occurs during the map/reduce stage: once a task instance fails for
> some unexpected reason (for example, an unstable spot instance gets killed), the
> later retry throws the above exception instead of overwriting the existing file.
>
> We have several guesses:
> 1. Is it caused by the ORC file type? I found a similar issue,
> https://issues.apache.org/jira/browse/HIVE-6341, but saw no comments there,
> and our table is stored as ORC.
> 2. Is the problem solved in a higher Hive version? We are also
> running Hive 2.3.6 and have not met this issue there, so we want to know whether a version
> upgrade would solve it.
> 3. Is there a config that supports always cleaning up existing folders
> during retry of the mapper/reducer stage? I have searched all mapreduce configs
> but cannot find one.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)