Miklos Szurap created HDFS-13623:
------------------------------------

             Summary: getContentSummary to return ContentSummary without hidden files
                 Key: HDFS-13623
                 URL: https://issues.apache.org/jira/browse/HDFS-13623
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: hdfs, namenode
    Affects Versions: 3.1.0
            Reporter: Miklos Szurap

Improve the [FileSystem.getContentSummary()|http://hadoop.apache.org/docs/r3.1.0/api/org/apache/hadoop/fs/FileSystem.html#getContentSummary-org.apache.hadoop.fs.Path-] method so that the returned ContentSummary object also provides "getFileCountWithoutHiddenFiles()" and "getLengthWithoutHiddenFiles()". These two new counters should not include hidden files or hidden directories (and their sub-contents). A path counts as hidden when its name starts with "_" or ".", as in this filter:

{code:java}
public static final PathFilter HIDDEN_FILES_PATH_FILTER = new PathFilter() {
  // Accept only paths whose last component does not start with "_" or "."
  public boolean accept(Path p) {
    String name = p.getName();
    return !name.startsWith("_") && !name.startsWith(".");
  }
};
{code}

This would be especially useful for Hive: table statistics could be computed with a single {{contentSummary}} call instead of {{globStatus}} (multiple {{listStatus}} calls) followed by iterating over several thousand objects on the client side.
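
To make the requested semantics concrete, here is a minimal sketch of the two counters computed over a local directory tree with {{java.nio.file}}. It is only an illustration, not the proposed NameNode implementation. The class {{VisibleContentSummary}} and its methods are hypothetical names that do not exist in Hadoop, and the sketch assumes that the root directory passed in is never treated as hidden itself.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical illustration class; not part of Hadoop.
final class VisibleContentSummary {
    final long fileCount; // number of non-hidden regular files
    final long length;    // total size in bytes of those files

    VisibleContentSummary(long fileCount, long length) {
        this.fileCount = fileCount;
        this.length = length;
    }

    // Same naming rule as HIDDEN_FILES_PATH_FILTER in the description.
    static boolean isVisible(Path p) {
        Path fileName = p.getFileName();
        if (fileName == null) {
            return true; // a filesystem root has no name component
        }
        String name = fileName.toString();
        return !name.startsWith("_") && !name.startsWith(".");
    }

    // Walks the tree once, skipping hidden directories with all their
    // sub-contents, and sums up the visible regular files.
    static VisibleContentSummary summarize(final Path root) throws IOException {
        final long[] totals = new long[2]; // [0] = file count, [1] = bytes
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                // Assumption: the starting directory itself is never filtered out.
                if (!dir.equals(root) && !isVisible(dir)) {
                    return FileVisitResult.SKIP_SUBTREE;
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                if (attrs.isRegularFile() && isVisible(file)) {
                    totals[0]++;
                    totals[1] += attrs.size();
                }
                return FileVisitResult.CONTINUE;
            }
        });
        return new VisibleContentSummary(totals[0], totals[1]);
    }

    public static void main(String[] args) throws IOException {
        // Usage: pass a directory path, or default to the current directory.
        Path root = java.nio.file.Paths.get(args.length > 0 ? args[0] : ".");
        VisibleContentSummary s = summarize(root);
        System.out.println("visible files: " + s.fileCount + ", bytes: " + s.length);
    }
}
```

For a table directory that contains {{part-0}}, {{_SUCCESS}}, a {{.staging}} directory, and a partition directory with one data file, only {{part-0}} and the partition's data file would contribute to the two counters. Doing this pruning on the server side during the existing summary traversal is what would spare the client the repeated {{listStatus}} calls.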