[ https://issues.apache.org/jira/browse/HIVE-15093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15646025#comment-15646025 ]
Hive QA commented on HIVE-15093:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12837843/HIVE-15093.8.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10628 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145)
org.apache.hadoop.hive.common.TestBlobStorageUtils.testValidAndInvalidFileSystems (batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2012/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2012/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2012/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12837843 - PreCommit-HIVE-Build

> S3-to-S3 Renames: Files should be moved individually rather than at a directory level
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-15093
>                 URL: https://issues.apache.org/jira/browse/HIVE-15093
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Hive
>    Affects Versions: 2.1.0
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>         Attachments: HIVE-15093.1.patch, HIVE-15093.2.patch,
> HIVE-15093.3.patch, HIVE-15093.4.patch, HIVE-15093.5.patch,
> HIVE-15093.6.patch, HIVE-15093.7.patch, HIVE-15093.8.patch
>
>
> Hive's MoveTask uses the Hive.moveFile method to move data within a
> distributed filesystem as well as blobstore filesystems.
> If the move is done within the same filesystem:
> 1: If the source path is a subdirectory of the destination path, files will
> be moved one by one using a threadpool of workers
> 2: If the source path is not a subdirectory of the destination path, a single
> rename operation is used to move the entire directory
> The second option may not work well on blobstores such as S3. Renames there
> are not metadata operations; they require copying all the data. Client
> connectors to blobstores may not rename directories efficiently. In the worst
> case, the connector copies each file one by one, sequentially, rather than
> using a threadpool of workers to copy the data (e.g. HADOOP-13600).
> Hive already has code to rename files using a threadpool of workers, but this
> only occurs in case number 1.
> This JIRA aims to modify the code so that case 1 is triggered when copying
> within a blobstore. The focus is on copies within a blobstore because
> needToCopy will return true if the src and target filesystems are different,
> in which case a different code path is triggered.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
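
For readers unfamiliar with the "case 1" behaviour the quoted description refers to, the following is a minimal sketch of moving files one by one with a pool of workers instead of issuing a single directory rename. It is not the code from the HIVE-15093 patches and does not use Hive's or Hadoop's APIs: the class name PerFileMoveSketch, the method moveFilesIndividually, and the use of java.nio.file on a local filesystem are all assumptions made so the example is self-contained. On a blobstore each per-file move would be a copy, which is why running them in parallel matters.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; names and structure are not taken from Hive.
public class PerFileMoveSketch {

    /**
     * Moves every regular file directly under srcDir into destDir,
     * submitting one task per file to a fixed-size thread pool.
     * Returns the number of files moved.
     */
    public static int moveFilesIndividually(Path srcDir, Path destDir, int threads)
            throws IOException, InterruptedException, ExecutionException {
        Files.createDirectories(destDir);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Path>> pending = new ArrayList<>();
            try (DirectoryStream<Path> entries = Files.newDirectoryStream(srcDir)) {
                for (Path src : entries) {
                    if (!Files.isRegularFile(src)) {
                        continue; // subdirectories are ignored in this sketch
                    }
                    Path dest = destDir.resolve(src.getFileName());
                    Callable<Path> task = () -> Files.move(src, dest);
                    pending.add(pool.submit(task));
                }
            }
            // Wait for all moves; the first failure surfaces as ExecutionException.
            for (Future<Path> result : pending) {
                result.get();
            }
            return pending.size();
        } finally {
            pool.shutdown();
        }
    }
}
```

The single-rename alternative (case 2 in the description) would replace the loop with one move of srcDir itself, leaving any parallelism to the underlying connector.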