[ https://issues.apache.org/jira/browse/HIVE-22054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Prabhas Kumar Samanta updated HIVE-22054:
-----------------------------------------
    Attachment: HIVE-22054.2.patch

> Avoid recursive listing to check if a directory is empty
> --------------------------------------------------------
>
>                 Key: HIVE-22054
>                 URL: https://issues.apache.org/jira/browse/HIVE-22054
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 0.13.0, 1.2.0, 2.1.0, 3.1.1, 2.3.5
>            Reporter: Prabhas Kumar Samanta
>            Assignee: Prabhas Kumar Samanta
>            Priority: Major
>         Attachments: HIVE-22054.2.patch, HIVE-22054.patch
>
>
> When a partition is dropped on a managed table, we first delete the directory
> corresponding to the partition. After that, we recursively delete the parent
> directory as well if the parent directory has become empty. To do this
> emptiness check, we call Warehouse::isEmpty(), which uses
> FileSystem::getContentSummary() and so recursively visits all files and
> subdirectories. This is a costly operation when a directory has a lot of
> files or subdirectories. The overhead is even more prominent on cloud-based
> file systems such as S3. For an emptiness check, it is also unnecessary.
> This recursive listing was introduced as part of HIVE-5220.
> Code snippet for reference:
> {code:java}
> // Warehouse.java
> public boolean isEmpty(Path path) throws IOException, MetaException {
>   ContentSummary contents = getFs(path).getContentSummary(path);
>   if (contents != null && contents.getFileCount() == 0 && contents.getDirectoryCount() == 1) {
>     return true;
>   }
>   return false;
> }
> // HiveMetaStore.java
> private void deleteParentRecursive(Path parent, int depth, boolean mustPurge, boolean needRecycle)
>     throws IOException, MetaException {
>   if (depth > 0 && parent != null && wh.isWritable(parent)) {
>     if (wh.isDir(parent) && wh.isEmpty(parent)) {
>       wh.deleteDir(parent, true, mustPurge, needRecycle);
>     }
>     deleteParentRecursive(parent.getParent(), depth - 1, mustPurge, needRecycle);
>   }
> }
> // Note: FileSystem::getContentSummary() performs a recursive listing.{code}

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
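
For illustration only, here is a minimal sketch of a non-recursive emptiness check of the kind the issue asks for. It is not taken from the attached patches, and it uses java.nio.file on a local file system rather than the Hadoop FileSystem API so that it runs without Hive or Hadoop on the classpath. The class and method names (DirectoryEmptiness, isEmptyDirectory) are hypothetical. The idea is to open a single-level listing and ask only whether a first entry exists; a Hadoop-based version would presumably do the same with FileSystem.listStatus(Path) or FileSystem.listStatusIterator(Path), though the actual fix may differ.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Hive: a local-file-system analogue
// of the emptiness check discussed in the issue.
final class DirectoryEmptiness {
    private DirectoryEmptiness() {}

    /**
     * Returns true if {@code dir} is an existing directory with no entries.
     * Only the immediate children are listed, and the listing stops at the
     * first entry; subdirectories are never traversed.
     */
    static boolean isEmptyDirectory(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return false; // missing paths and regular files are not empty directories
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            return !entries.iterator().hasNext();
        }
    }
}
```

Like the getContentSummary()-based check quoted above (file count 0, directory count 1), this treats a directory as empty only when it contains no files and no subdirectories, but its cost does not grow with the depth or size of the tree below a non-empty directory. Any entry counts, including hidden files; whether hidden entries should be ignored is a separate decision that this sketch does not make.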