[ https://issues.apache.org/jira/browse/HADOOP-13704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Loughran resolved HADOOP-13704.
-------------------------------------
    Fix Version/s: 3.3.3
       Resolution: Fixed

fixed in 3.3.3. thanks!

> S3A getContentSummary() to move to listFiles(recursive) to count children; instrument use
> -----------------------------------------------------------------------------------------
>
>                 Key: HADOOP-13704
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13704
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>            Assignee: Ahmar Suhail
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 3.3.3
>
>          Time Spent: 3h
>  Remaining Estimate: 0h
>
> Hive and a bit of Spark use {{getContentSummary()}} to get some summary stats
> of a filesystem. This is very expensive on S3A (and any other object store),
> especially as the base implementation does the recursive tree walk.
> Because of HADOOP-13208, we have a full enumeration of files under a path
> without directory costs...S3A can/should switch to this to speed up those
> places where the operation is called.
> Also
> * API call needs FS spec and contract tests
> * S3A could instrument invocation, so as to enable real-world popularity to
> be measured
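
For readers unfamiliar with the idea in the issue description, here is a minimal, self-contained sketch of how summary stats can be built from one flat recursive listing instead of a directory-by-directory tree walk. It is not the S3A code that shipped in 3.3.3. The names `FlatListingSummary`, `FileEntry`, `Summary`, `summarize` and `parentOf` are all hypothetical, the listing is an in-memory list standing in for the results of a recursive file listing, and counting the base directory itself as one directory is an assumption about the expected semantics.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class FlatListingSummary {

    /** One file from a flat, recursive listing: absolute path plus size in bytes. */
    static final class FileEntry {
        final String path;
        final long length;

        FileEntry(String path, long length) {
            this.path = path;
            this.length = length;
        }
    }

    /** Aggregated totals (hypothetical type, loosely modelled on a content summary). */
    static final class Summary {
        final long length;
        final long fileCount;
        final long directoryCount;

        Summary(long length, long fileCount, long directoryCount) {
            this.length = length;
            this.fileCount = fileCount;
            this.directoryCount = directoryCount;
        }
    }

    /**
     * Builds the totals in a single pass over the flat listing.
     * Assumes base has no trailing slash and is not the root "/".
     */
    static Summary summarize(String base, List<FileEntry> files) {
        Set<String> dirs = new HashSet<>();
        dirs.add(base); // assumption: the base directory counts as one directory
        long totalLength = 0;
        long fileCount = 0;
        for (FileEntry f : files) {
            totalLength += f.length;
            fileCount++;
            // Infer directories from the file's ancestors below the base,
            // so no per-directory listing call is ever needed.
            String parent = parentOf(f.path);
            while (parent != null && parent.startsWith(base + "/")) {
                dirs.add(parent);
                parent = parentOf(parent);
            }
        }
        return new Summary(totalLength, fileCount, dirs.size());
    }

    /** Returns the parent path, or null when the path has no parent below the root. */
    static String parentOf(String path) {
        int idx = path.lastIndexOf('/');
        return idx <= 0 ? null : path.substring(0, idx);
    }

    public static void main(String[] args) {
        List<FileEntry> listing = List.of(
                new FileEntry("/data/a.csv", 100),
                new FileEntry("/data/2021/b.csv", 250),
                new FileEntry("/data/2021/q1/c.csv", 50));
        Summary s = summarize("/data", listing);
        System.out.println(s.length + " bytes, " + s.fileCount + " files, "
                + s.directoryCount + " directories");
    }
}
```

One limitation of this sketch: a directory that contains no files never appears in a flat file listing, so it is not counted here. How the real implementation treats empty directories is not stated in the issue text above.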