Gabor Bota created HADOOP-16185:
-----------------------------------
Summary: S3Guard: Optimize performance of handling OOB operations
in non-authoritative mode
Key: HADOOP-16185
URL: https://issues.apache.org/jira/browse/HADOOP-16185
Project: Hadoop Common
Issue Type: Sub-task
Components: fs/s3
Affects Versions: 3.1.0
Reporter: Gabor Bota
HADOOP-15999 changes S3Guard's non-authoritative mode so that, when S3Guard
runs non-authoritative, every {{fs.getFileStatus}} call checks S3, because the
MetadataStore is not treated as the single source of truth. This has a negative
performance impact.
In other words, HADOOP-15999 is going to reinstate the HEAD on every read,
making non-auth S3Guard a bit slower. We could address that by moving the check
into the input stream itself: the first GET which returns data would also act
as the metadata check. That would mean the read context needs updating with
some "metastoreProcessHeader" callback to invoke on the first GET.
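A minimal sketch of that idea, not the S3A implementation: the stream issues
no request until the first read, and the headers returned by that first GET are
handed to a callback that stands in for "metastoreProcessHeader". The types
ObjectHeader, GetResponse and LazyCheckedInputStream are hypothetical names
made up for this illustration, and the GET is modelled as a plain Supplier
rather than a real S3 client call.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Metadata carried on a GET response (hypothetical stand-in for S3 object headers). */
final class ObjectHeader {
    final long length;
    final String etag;

    ObjectHeader(long length, String etag) {
        this.length = length;
        this.etag = etag;
    }
}

/** A GET response: headers plus body (hypothetical). */
final class GetResponse {
    final ObjectHeader header;
    final InputStream body;

    GetResponse(ObjectHeader header, InputStream body) {
        this.header = header;
        this.body = body;
    }
}

/** Input stream that defers the metadata check to the first GET. */
final class LazyCheckedInputStream extends InputStream {
    private final Supplier<GetResponse> getRequest;               // assumed: issues the GET
    private final Consumer<ObjectHeader> metastoreProcessHeader;  // assumed callback shape
    private InputStream body;                                     // null until the first read

    LazyCheckedInputStream(Supplier<GetResponse> getRequest,
                           Consumer<ObjectHeader> metastoreProcessHeader) {
        this.getRequest = getRequest;
        this.metastoreProcessHeader = metastoreProcessHeader;
    }

    /** Issue the GET once; its headers double as the metadata check. */
    private InputStream open() {
        if (body == null) {
            GetResponse response = getRequest.get();
            metastoreProcessHeader.accept(response.header);
            body = response.body;
        }
        return body;
    }

    @Override
    public int read() throws IOException {
        return open().read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        return open().read(b, off, len);
    }

    @Override
    public void close() throws IOException {
        if (body != null) {
            body.close();
        }
    }
}
```

In this sketch the callback runs exactly once, on the same round trip that
fetches the data, so no separate HEAD is issued before the read. What the
callback should do on a mismatch with the MetadataStore entry is left open
here.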
The good news is that because it's reading a file, it's only one HTTP HEAD
request: there is no need to do either of the other two directory probes unless
the file isn't there.
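To illustrate the request count, here is a rough sketch of a three-step probe
against an in-memory set of keys. It assumes the probe order is HEAD on the
path, HEAD on the path with a trailing "/" (a directory marker), then a LIST
under that prefix. ProbeSketch and its methods are hypothetical names for this
example and are not S3A code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** In-memory model of the status probes; records every simulated request. */
final class ProbeSketch {
    final Set<String> keys = new TreeSet<>();       // object keys in the pretend bucket
    final List<String> requests = new ArrayList<>(); // requests made by the last probe

    private boolean head(String key) {
        requests.add("HEAD " + key);
        return keys.contains(key);
    }

    private boolean list(String prefix) {
        requests.add("LIST " + prefix);
        for (String key : keys) {
            if (key.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    /** Returns "file", "dir" or "missing"; stops at the first probe that succeeds. */
    String probe(String path) {
        requests.clear();
        if (head(path)) {
            return "file";          // a file costs a single HEAD
        }
        if (head(path + "/")) {
            return "dir";           // directory marker object
        }
        if (list(path + "/")) {
            return "dir";           // directory implied by child keys
        }
        return "missing";
    }
}
```

With a bucket holding only the key data/part-0000, probing that key records
one HEAD, while probing data or a nonexistent path records all three requests.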