Miklos Szurap created HDFS-13623:
------------------------------------
Summary: getContentSummary to return ContentSummary without hidden files
Key: HDFS-13623
URL: https://issues.apache.org/jira/browse/HDFS-13623
Project: Hadoop HDFS
Issue Type: Improvement
Components: hdfs, namenode
Affects Versions: 3.1.0
Reporter: Miklos Szurap
Improve the
[FileSystem.getContentSummary()|http://hadoop.apache.org/docs/r3.1.0/api/org/apache/hadoop/fs/FileSystem.html#getContentSummary-org.apache.hadoop.fs.Path-]
method so that the returned ContentSummary object also provides
"getFileCountWithoutHiddenFiles()" and "getLengthWithoutHiddenFiles()".
These two new counters should exclude hidden files and hidden directories
(along with their sub-contents), where "hidden" means a name starting with "_"
or ".", as in this filter:
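{code:java}
// Accepts only paths whose final name component is not hidden.
public static final PathFilter HIDDEN_FILES_PATH_FILTER = new PathFilter() {
  @Override
  public boolean accept(Path p) {
    String name = p.getName();
    // Names starting with "_" or "." are treated as hidden.
    return !name.startsWith("_") && !name.startsWith(".");
  }
};
{code}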
This would be especially useful for Hive: table statistics could be computed
with a single {{getContentSummary}} call instead of {{globStatus}} (which
issues multiple {{listStatus}} calls) followed by iterating over thousands of
objects on the client side.
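
To make the intended semantics concrete, here is a minimal sketch of the two counters. It is not the proposed NameNode implementation: it uses java.nio.file on a local directory as a stand-in for HDFS, and the class name VisibleContentSummary and its members are hypothetical names chosen for illustration. It assumes the same rule as the filter above (names starting with "_" or "." are hidden) and skips a hidden directory together with everything beneath it.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Illustration only: local-filesystem stand-in for the proposed counters. */
final class VisibleContentSummary {
    final long fileCount; // files that are not hidden and not under a hidden directory
    final long length;    // total size in bytes of those files

    VisibleContentSummary(long fileCount, long length) {
        this.fileCount = fileCount;
        this.length = length;
    }

    /** Same rule as HIDDEN_FILES_PATH_FILTER: "_" and "." prefixes mark hidden names. */
    static boolean isVisible(Path p) {
        String name = p.getFileName().toString();
        return !name.startsWith("_") && !name.startsWith(".");
    }

    /** Recursively sums the visible files under dir; dir itself is not filtered. */
    static VisibleContentSummary summarize(Path dir) throws IOException {
        long count = 0;
        long bytes = 0;
        try (DirectoryStream<Path> children = Files.newDirectoryStream(dir)) {
            for (Path child : children) {
                if (!isVisible(child)) {
                    continue; // hidden file, or hidden directory with all its sub-contents
                }
                if (Files.isDirectory(child)) {
                    VisibleContentSummary sub = summarize(child);
                    count += sub.fileCount;
                    bytes += sub.length;
                } else {
                    count++;
                    bytes += Files.size(child);
                }
            }
        }
        return new VisibleContentSummary(count, bytes);
    }
}
```

Calling VisibleContentSummary.summarize(root) on a table directory returns both values in one pass. The request above is to have the NameNode do the equivalent work, so that the client does not have to list every object itself.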