[ https://issues.apache.org/jira/browse/HADOOP-16468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Loughran resolved HADOOP-16468.
-------------------------------------
    Fix Version/s: hadoop-13704
       Resolution: Duplicate

> S3AFileSystem.getContentSummary() to use listFiles(recursive)
> -------------------------------------------------------------
>
>                 Key: HADOOP-16468
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16468
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs, fs/s3
>    Affects Versions: 3.2.0
>            Reporter: Steve Loughran
>            Priority: Major
>             Fix For: hadoop-13704
>
>
> HIVE-22054 discusses how they use getContentSummary to see if a directory is
> empty.
> This is implemented in FileSystem as a recursive treewalk, with all the costs
> there.
> Hive is moving off it; once that is in it won't be so much of an issue. But
> if we wanted to speed up older versions of Hive, we could move the operation
> to using a flat list.
> That would give us the file size rapidly; the directory count would have to
> be worked out by tracking parent dirs of all paths (and all entries ending
> with /), and adding them up.
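
The flat-list idea in the issue description can be sketched roughly as below. This is only an illustration under stated assumptions, not the S3A implementation: `FlatListSummary`, `Summary` and `summarize` are hypothetical names, and the listing is modelled as a plain map of object key to size instead of the result of `listFiles(recursive)`. File sizes are summed directly. Directories are found by collecting the parent paths of every key, plus every key ending with `/`, into a set and counting it. Keys are assumed to be normalized (no `//`). The base directory itself is not counted, so a caller that wants it included would add one.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a content summary built from a flat key listing. */
final class FlatListSummary {

    /** Totals gathered from one pass over the listing. */
    static final class Summary {
        final long length;          // sum of file sizes in bytes
        final long fileCount;       // number of non-marker entries
        final long directoryCount;  // distinct directories under the base

        Summary(long length, long fileCount, long directoryCount) {
            this.length = length;
            this.fileCount = fileCount;
            this.directoryCount = directoryCount;
        }
    }

    /**
     * @param base    key of the directory to summarize, e.g. "data"
     * @param listing object key to size in bytes, as a flat listing would give
     */
    static Summary summarize(String base, Map<String, Long> listing) {
        String prefix = base.isEmpty() || base.endsWith("/") ? base : base + "/";
        Set<String> dirs = new HashSet<>();
        long length = 0;
        long files = 0;

        for (Map.Entry<String, Long> entry : listing.entrySet()) {
            String key = entry.getKey();
            if (!key.startsWith(prefix)) {
                continue; // outside the directory being summarized
            }
            String rel = key.substring(prefix.length());
            if (rel.isEmpty()) {
                continue; // marker entry for the base directory itself
            }
            if (rel.endsWith("/")) {
                // An entry ending with "/" is treated as a directory marker.
                rel = rel.substring(0, rel.length() - 1);
                dirs.add(rel);
            } else {
                files++;
                length += entry.getValue();
            }
            // Record every ancestor directory of this entry, relative to base.
            int slash = rel.lastIndexOf('/');
            while (slash > 0) {
                rel = rel.substring(0, slash);
                dirs.add(rel);
                slash = rel.lastIndexOf('/');
            }
        }
        return new Summary(length, files, dirs.size());
    }
}
```

For a listing holding `data/a/1.txt` (10 bytes), `data/a/b/2.txt` (20 bytes), `data/3.txt` (5 bytes) and the marker `data/empty/`, calling `summarize("data", listing)` gives a length of 35, 3 files and 3 directories (`a`, `a/b`, `empty`). The set is what keeps a directory with many children from being counted more than once.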