[ https://issues.apache.org/jira/browse/HADOOP-18582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ayush Saxena resolved HADOOP-18582.
-----------------------------------
    Hadoop Flags: Reviewed
      Resolution: Fixed

> No need to clean tmp files in distcp direct mode
> ------------------------------------------------
>
>                 Key: HADOOP-18582
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18582
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: tools/distcp
>    Affects Versions: 3.3.4
>            Reporter: 10000kang
>            Assignee: 10000kang
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0, 3.3.9
>
>
> It is not necessary to run `cleanupTempFiles` when distcp commits a job in
> direct mode, because there are no temp files in direct mode.
> This cleanup increases the task execution time, because it lists the files
> in the target path. When the target path contains a very large number of
> files, this operation is very slow.
> *note* there are two patches which need to be cherry-picked when picking this
> up: the original patch and a follow-up, both with HADOOP-18582 in the title
> {code}
> 3b7b79b37ae HADOOP-18582. skip unnecessary cleanup logic in distcp (#5251)
> e8a6b2c2c4e HADOOP-18582. Addendum: Skip unnecessary cleanup logic in DistCp.
> {code}
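
For readers who want to see the idea behind the fix, here is a minimal sketch in plain Java. It is not the actual patch and does not use the Hadoop API: the class `DirectModeCleanupSketch`, the nested `Committer`, the `listingCalls` counter and the `.distcp.tmp.` prefix are all assumptions made up for illustration. It only shows the shape of the change: when the job runs in direct mode, the commit step returns before the target listing that the cleanup would otherwise need.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DirectModeCleanupSketch {

    /** Hypothetical stand-in for a job committer; not the Hadoop class. */
    static class Committer {
        private final boolean directWrite;
        /** Counts how often the (potentially slow) target listing runs. */
        int listingCalls = 0;

        Committer(boolean directWrite) {
            this.directWrite = directWrite;
        }

        /** Commit step: only clean up temp files when they can exist. */
        void commitJob(List<String> targetFiles) {
            if (directWrite) {
                // Direct mode writes straight to the final path, so there
                // is nothing to clean and no reason to list the target.
                return;
            }
            cleanupTempFiles(targetFiles);
        }

        /** Simulated cleanup: one listing pass that drops temp entries. */
        private void cleanupTempFiles(List<String> targetFiles) {
            listingCalls++;
            // Assumed temp-file prefix, used here only for illustration.
            targetFiles.removeIf(name -> name.startsWith(".distcp.tmp."));
        }
    }

    public static void main(String[] args) {
        List<String> target = new ArrayList<>(
                Arrays.asList("part-0", ".distcp.tmp.attempt_1", "part-1"));

        Committer direct = new Committer(true);
        direct.commitJob(target);
        System.out.println("direct mode listings: " + direct.listingCalls);

        Committer staged = new Committer(false);
        staged.commitJob(target);
        System.out.println("non-direct listings: " + staged.listingCalls);
        System.out.println("remaining files: " + target);
    }
}
```

In the sketch the saving is a single skipped list pass; in the case described in the issue the listing runs against a target path with a very large number of files, which is where the time goes.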