[ 
https://issues.apache.org/jira/browse/HADOOP-17400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukund Thakur resolved HADOOP-17400.
------------------------------------
    Resolution: Fixed

> Optimize S3A for maximum performance in directory listings
> ----------------------------------------------------------
>
>                 Key: HADOOP-17400
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17400
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>    Affects Versions: 3.3.0
>            Reporter: Steve Loughran
>            Assignee: Mukund Thakur
>            Priority: Major
>
> Make listing in applications as fast as we can get it, especially for query 
> planning.
> * Optimize all operations used to list directories (for query planning etc.) 
> for their primary use, being passed directories rather than files, and make 
> that case faster even at the expense of more remote IO when handed files or 
> empty directories.
> * Remove needless calls to S3 wherever possible (e.g. {{getFileStatus("/")}}, 
> making bucket existence probes optional).
> * Support/enable asynchronous IO where possible.
>  
> Review higher-level APIs (glob status) and their uses in the FsShell, and 
> optimize them by minimising invocations or FS API calls, with the bonus goal 
> of reducing the risk of 404 caching.
> Work with downstream projects to move to the FS APIs which work best in this 
> world, primarily the recursive listing operations and those which return 
> RemoteIterator<FileStatus>, and so make any asynchronous page fetching 
> operations useful. 
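
To illustrate why iterator-returning listing APIs make asynchronous page fetching useful, here is a minimal, self-contained sketch. It does not use any Hadoop classes and is not the S3A implementation: `RemoteIter`, `PrefetchingListing`, `ListingDemo` and `fakePage` are hypothetical names invented for this example, and the "store" is an in-memory stand-in that serves three pages of two keys each. The idea shown is only that the next page can be requested in the background while the caller is still consuming the current one.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.function.IntFunction;

/** Hypothetical stand-in that only mirrors the shape of an IOException-throwing iterator. */
interface RemoteIter<T> {
    boolean hasNext() throws IOException;
    T next() throws IOException;
}

/** Iterates over paged results, fetching the next page while the current one is consumed. */
final class PrefetchingListing<T> implements RemoteIter<T> {
    // Assumption: maps a page index to its entries; an empty list marks the end.
    private final IntFunction<List<T>> pageFetcher;
    private Iterator<T> current = Collections.emptyIterator();
    private CompletableFuture<List<T>> pending;
    private int nextPage = 0;
    private boolean exhausted = false;

    PrefetchingListing(IntFunction<List<T>> pageFetcher) {
        this.pageFetcher = pageFetcher;
        this.pending = fetchAsync(); // request the first page immediately
    }

    private CompletableFuture<List<T>> fetchAsync() {
        final int page = nextPage++;
        return CompletableFuture.supplyAsync(() -> pageFetcher.apply(page));
    }

    @Override
    public boolean hasNext() throws IOException {
        while (!current.hasNext() && !exhausted) {
            List<T> page;
            try {
                page = pending.join(); // blocks only if the prefetch has not finished
            } catch (CompletionException e) {
                throw new IOException("page fetch failed", e.getCause());
            }
            if (page.isEmpty()) {
                exhausted = true;
            } else {
                current = page.iterator();
                pending = fetchAsync(); // overlap the next fetch with consumption
            }
        }
        return current.hasNext();
    }

    @Override
    public T next() throws IOException {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }
}

final class ListingDemo {
    /** Simulated paged store: three pages of two keys, then an empty page. */
    static List<String> fakePage(int page) {
        if (page >= 3) {
            return List.of();
        }
        return List.of("dir/file-" + (2 * page), "dir/file-" + (2 * page + 1));
    }

    /** Consumes the iterator fully, as a caller that needs every entry would. */
    static List<String> drain(RemoteIter<String> it) throws IOException {
        List<String> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(drain(new PrefetchingListing<String>(ListingDemo::fakePage)));
    }
}
```

A caller that instead asks for a complete array of results gains nothing from this overlap, since it must wait for every page before it can start work. In Hadoop itself, the comparable entry point is `FileSystem.listFiles(path, true)`, which returns a `RemoteIterator<LocatedFileStatus>`.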



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org
