[ https://issues.apache.org/jira/browse/HIVE-22054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16895040#comment-16895040 ]
Hive QA commented on HIVE-22054:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12976094/HIVE-22054.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 16710 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/18188/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18188/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18188/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12976094 - PreCommit-HIVE-Build

> Avoid recursive listing to check if a directory is empty
> --------------------------------------------------------
>
>                 Key: HIVE-22054
>                 URL: https://issues.apache.org/jira/browse/HIVE-22054
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 0.13.0, 1.2.0, 2.1.0, 3.1.1, 2.3.5
>            Reporter: Prabhas Kumar Samanta
>            Assignee: Prabhas Kumar Samanta
>            Priority: Major
>         Attachments: HIVE-22054.2.patch, HIVE-22054.patch
>
>
> When dropping a partition on a managed table, we first delete the directory
> corresponding to the partition. After that, we also delete the parent
> directory if it has become empty, and repeat this for its ancestors. To
> perform this emptiness check, we call Warehouse::isEmpty(), which uses
> FileSystem::getContentSummary() and therefore recursively visits all files
> and subdirectories. This is a costly operation when a directory has a lot of
> files or subdirectories. The overhead is even more prominent on cloud-based
> file systems like S3. It is also unnecessary: an emptiness check only needs
> to know whether the directory has at least one direct child.
> This recursive listing was introduced as part of HIVE-5220. Code snippet
> for reference:
> {code:java}
> // Warehouse.java
> public boolean isEmpty(Path path) throws IOException, MetaException {
>   ContentSummary contents = getFs(path).getContentSummary(path);
>   if (contents != null && contents.getFileCount() == 0 && contents.getDirectoryCount() == 1) {
>     return true;
>   }
>   return false;
> }
>
> // HiveMetaStore.java
> private void deleteParentRecursive(Path parent, int depth, boolean mustPurge, boolean needRecycle)
>     throws IOException, MetaException {
>   if (depth > 0 && parent != null && wh.isWritable(parent)) {
>     if (wh.isDir(parent) && wh.isEmpty(parent)) {
>       wh.deleteDir(parent, true, mustPurge, needRecycle);
>     }
>     deleteParentRecursive(parent.getParent(), depth - 1, mustPurge, needRecycle);
>   }
> }
> // Note: FileSystem::getContentSummary() performs a recursive listing.
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
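
To illustrate the difference described in the issue, here is a minimal sketch that contrasts a recursive emptiness check with a shallow one. It is not the attached patch. It uses java.nio.file on the local file system as a stand-in for the Hadoop FileSystem API so that it runs without extra dependencies; the class name EmptinessCheck and both method names are hypothetical. On Hadoop, a shallow check would presumably rely on a single-level listing such as FileSystem.listStatus(Path) instead of getContentSummary(), but that mapping is an assumption here.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical illustration class; not part of Hive or Hadoop.
public class EmptinessCheck {

    // Recursive variant: walks the entire subtree, so its cost grows with the
    // number of files and subdirectories (analogous to a content summary).
    static boolean isEmptyRecursive(Path dir) throws IOException {
        try (Stream<Path> tree = Files.walk(dir)) {
            // Files.walk includes the starting directory itself, so an empty
            // directory yields exactly one entry.
            return tree.count() == 1;
        }
    }

    // Shallow variant: opens a single-level listing and only asks whether a
    // first child exists; nested content is never visited.
    static boolean isEmptyShallow(Path dir) throws IOException {
        try (DirectoryStream<Path> children = Files.newDirectoryStream(dir)) {
            return !children.iterator().hasNext();
        }
    }

    public static void main(String[] args) throws IOException {
        Path parent = Files.createTempDirectory("partition-parent");
        System.out.println(isEmptyShallow(parent));    // true

        Path child = Files.createDirectory(parent.resolve("part1"));
        Files.createFile(child.resolve("data.txt"));
        System.out.println(isEmptyShallow(parent));    // false
        System.out.println(isEmptyRecursive(parent));  // false
    }
}
```

Both variants agree on the answer. The shallow one simply stops as soon as it sees one direct child, which is the property that matters for large directories and for object stores where each listing call is expensive.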