[ https://issues.apache.org/jira/browse/HIVE-25990?focusedWorklogId=737883&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-737883 ]
ASF GitHub Bot logged work on HIVE-25990:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Mar/22 00:31
            Start Date: 08/Mar/22 00:31
    Worklog Time Spent: 10m
      Work Description:
rbalamohan commented on a change in pull request #3058:
URL: https://github.com/apache/hive/pull/3058#discussion_r821221192

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
##########
@@ -1514,6 +1523,17 @@ public static void mvFileToFinalPath(Path specPath, Configuration hconf,
     fs.delete(taskTmpPath, true);
   }
 
+  private static void createFileList(Set<FileStatus> filesKept, Path srcPath, Path targetPath, FileSystem fs)
+      throws IOException {
+    try (FSDataOutputStream outStream = fs.create(new Path(targetPath, BLOB_FILES_KEPT))) {
+      outStream.writeBytes(srcPath.toString() + System.lineSeparator());

Review comment:
   Is it necessary to write srcPath as the first entry? Only the file paths
   could be written out, and when reading, the parent directory of the first
   entry could be assumed to be srcPath.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 737883)
    Time Spent: 0.5h  (was: 20m)

> Optimise multiple copies in case of CTAS in external tables for Object stores
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-25990
>                 URL: https://issues.apache.org/jira/browse/HIVE-25990
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ayush Saxena
>            Assignee: Ayush Saxena
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Presently for CTAS with external tables, there are two rename operations:
> one from tmp to _ext and then one from _ext to the actual target.
> In the case of object stores, each rename results in an actual copy. Avoid
> the rename from tmp to _ext by instead creating a list of the files to be
> copied in that directory, which the move task can consume to copy
> directly from tmp to the actual target.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
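
The review question above (drop the srcPath header line and recover the source directory from the first entry's parent) can be sketched as follows. This is only an illustration under stated assumptions, not the code in the pull request: java.nio.file stands in for the Hadoop FileSystem API, and the class name, the method names and the MANIFEST_NAME value are all hypothetical. It also assumes every kept file sits directly under the source directory. With an empty list there is no first entry, so the source directory cannot be inferred at all.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class FileListSketch {
  // Hypothetical manifest name; the actual value of BLOB_FILES_KEPT is not shown in the hunk.
  static final String MANIFEST_NAME = "_blob_files_kept";

  /** Writes one file path per line into targetDir/MANIFEST_NAME, with no srcPath header line. */
  static Path writeFileList(List<Path> filesKept, Path targetDir) throws IOException {
    Files.createDirectories(targetDir);
    List<String> lines = new ArrayList<>();
    for (Path file : filesKept) {
      lines.add(file.toString());
    }
    return Files.write(targetDir.resolve(MANIFEST_NAME), lines, StandardCharsets.UTF_8);
  }

  /** Reads the manifest back into a list of paths, skipping empty lines. */
  static List<Path> readFileList(Path targetDir) throws IOException {
    List<Path> files = new ArrayList<>();
    Path manifest = targetDir.resolve(MANIFEST_NAME);
    for (String line : Files.readAllLines(manifest, StandardCharsets.UTF_8)) {
      if (!line.isEmpty()) {
        files.add(Paths.get(line));
      }
    }
    return files;
  }

  /** Infers the source directory as the parent of the first entry; null when the list is empty. */
  static Path inferSourceDir(List<Path> files) {
    return files.isEmpty() ? null : files.get(0).getParent();
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempDirectory("ctas-tmp");
    Path ext = Files.createTempDirectory("ctas-ext");
    // The listed files need not exist for the manifest round trip itself.
    List<Path> kept = Arrays.asList(tmp.resolve("000000_0"), tmp.resolve("000001_0"));
    writeFileList(kept, ext);
    List<Path> read = readFileList(ext);
    System.out.println("entries match: " + read.equals(kept));
    System.out.println("inferred source dir matches: " + tmp.equals(inferSourceDir(read)));
  }
}
```

The empty-list case is one reason an explicit srcPath line might still be wanted; whether that matters depends on how the move task treats a CTAS that produced no files, which this thread does not say.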