[
https://issues.apache.org/jira/browse/HADOOP-19272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882121#comment-17882121
]
ASF GitHub Bot commented on HADOOP-19272:
-----------------------------------------
steveloughran opened a new pull request, #7048:
URL: https://github.com/apache/hadoop/pull/7048
Disables all logging by the AWS SDK Transfer Manager.
This is done in ClientManagerImpl construction so is automatically done
during S3AFS initialization.
ITests verify that
* It is possible to restore the warning log. This verifies the validity of
the test suite, and will identify when an SDK update fixes this regression.
* Constructing an S3A FS instance will disable the logging.
The log manipulation code is lifted from Cloudstore, where it was used to
dynamically enable logging. It uses reflection to load the Log4J binding; all
uses of the API catch and swallow exceptions.
This is needed to avoid failures when running against different log backends.
This is an emergency fix; we could come up with a better design for the
reflection-based code using the new DynMethods classes. But this is based on
working code, which is always good.
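As a rough illustration of the reflection approach described above, here is a minimal sketch. It is not the code in this PR or in Cloudstore: the class name `Log4jLevels` and its methods are hypothetical, the logger name `software.amazon.awssdk.transfer.s3` is assumed from the SDK package of `S3TransferManager`, and the Log4J 1.x API (`Logger.getLogger(String)`, `Level.toLevel(String)`, `setLevel(Level)`) is assumed to be the binding in use. Every failure is caught and reported as `false`, so the call is harmless when a different log backend is on the classpath.

```java
import java.lang.reflect.Method;

/** Hypothetical helper for illustration only; not the class added by the PR. */
public final class Log4jLevels {

  private Log4jLevels() {
  }

  /**
   * Try to set a log level through the Log4J 1.x API, if it is on the classpath.
   * @return true if the level was set, false on any failure.
   */
  public static boolean setLogLevel(String log, String level) {
    return setLogLevel("org.apache.log4j.Logger", "org.apache.log4j.Level",
        log, level);
  }

  /** As above, with the binding class names passed in so failures can be tested. */
  public static boolean setLogLevel(String loggerClassName, String levelClassName,
      String log, String level) {
    try {
      // Load the binding reflectively: there is no compile-time dependency.
      Class<?> loggerClass = Class.forName(loggerClassName);
      Class<?> levelClass = Class.forName(levelClassName);
      // Static factories: Logger.getLogger(String) and Level.toLevel(String).
      Method getLogger = loggerClass.getMethod("getLogger", String.class);
      Method toLevel = levelClass.getMethod("toLevel", String.class);
      // Instance method inherited by Logger: setLevel(Level).
      Method setLevel = loggerClass.getMethod("setLevel", levelClass);
      Object logger = getLogger.invoke(null, log);
      Object levelValue = toLevel.invoke(null, level);
      setLevel.invoke(logger, levelValue);
      return true;
    } catch (ReflectiveOperationException | LinkageError | RuntimeException e) {
      // Swallow everything: a missing or different backend must not break callers.
      return false;
    }
  }

  public static void main(String[] args) {
    // Assumed logger name: the package containing S3TransferManager.
    boolean changed = setLogLevel("software.amazon.awssdk.transfer.s3", "OFF");
    System.out.println("log level changed: " + changed);
  }
}
```

The two-argument method returns `false` rather than throwing when Log4J is absent, which is the property that matters for running against other log backends.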
### How was this patch tested?
New ITests
### For code changes:
- [X] Does the title of this PR start with the corresponding JIRA issue id
(e.g. 'HADOOP-17799. Your PR title ...')?
- [ ] Object storage: have the integration tests been executed and the
endpoint declared according to the connector-specific documentation?
- [ ] If adding new dependencies to the code, are these dependencies
licensed in a way that is compatible for inclusion under [ASF
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`,
`NOTICE-binary` files?
> S3A: AWS SDK 2.25.53 warnings logged about transfer manager not using CRT
> client
> --------------------------------------------------------------------------------
>
> Key: HADOOP-19272
> URL: https://issues.apache.org/jira/browse/HADOOP-19272
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3
> Affects Versions: 3.4.0, 3.5.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Major
> Attachments: output.txt
>
>
> When an S3 transfer manager is created for a rename or download, a new
> message is logged telling off the caller for not using the CRT client.
> {code}
> 5645:2024-09-13 16:29:17,375 [setup] WARN s3.S3TransferManager
> (LoggerAdapter.java:warn(225)) - The provided S3AsyncClient is an instance of
> MultipartS3AsyncClient, and thus multipart download feature is not enabled.
> To benefit from all features, consider using
> S3AsyncClient.crtBuilder().build() instead.
> {code}
> This is a change in the SDK to tell us developers off, yet it is visible to
> end users who don't benefit from it and for whom it only creates confusion.
> It appears to have been downgraded to debug in the AWS trunk code in PR "S3
> Async Client - Multipart download" (#5164), but:
> * it is too late to upgrade and qualify a new version for 3.4.1; downgrading
> is all we can do
> * there is no guarantee this log message or similar will not reoccur.
> Plan
> 1. Revert from 3.4.1
> 2. Lift code from the cloudstore library, which uses reflection to access and
> manipulate log4j logs where present
> 3. Downgrade all transfer manager log levels to NONE.
> 4. File an AWS report about how this is an incompatible regression, identify
> how their process can evolve, particularly in the area of code guidelines
> about safe logging use.
> I also intend to tighten up our review process to support more rigorous
> detection of new .warn() messages in the AWS SDK. I'm going to propose that,
> as well as requiring review of our test/CLI output, we require ripgrep scans
> for .warn( and .error( in the SDK source, and an audit of any new changes. By
> saving the output of the previous iteration, it'll be straightforward to
> identify new log statements, but not changes in codepaths which alter how
> often existing messages appear.
> I think we should revisit whether or not to move off the transfer manager.
> We've discussed it in the past, and avoided it just due to maintenance
> costs. However, it is pushing maintenance costs onto us anyway.
> Meanwhile: no new AWS SDK updates until we are confident we have our
> processes under control.
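The scan-and-compare audit proposed in the issue description could look roughly like the sketch below. It is an assumption-laden illustration, not an agreed process: the class `WarnScan`, its methods, and the sample file and statements are hypothetical, and a real audit would more likely diff saved ripgrep output. Entries are keyed on file name and statement text rather than line number, so unrelated edits that shift lines do not show up as new.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical audit helper for illustration only. */
public final class WarnScan {

  /** Matches calls such as log.warn( or LOG.error(. */
  private static final Pattern LOG_CALL = Pattern.compile("\\.(warn|error)\\(");

  private WarnScan() {
  }

  /** Collect "file: statement" entries for lines containing .warn( or .error(. */
  public static Set<String> scan(String file, List<String> lines) {
    Set<String> hits = new LinkedHashSet<>();
    for (String line : lines) {
      if (LOG_CALL.matcher(line).find()) {
        hits.add(file + ": " + line.trim());
      }
    }
    return hits;
  }

  /** Entries present in the current scan but absent from the previous one. */
  public static Set<String> newEntries(Set<String> previous, Set<String> current) {
    Set<String> added = new LinkedHashSet<>(current);
    added.removeAll(previous);
    return added;
  }

  public static void main(String[] args) {
    // Made-up source lines standing in for two SDK releases.
    Set<String> previous = scan("Example.java", Arrays.asList(
        "log.error(() -> \"upload failed\");",
        "int x = 1;"));
    Set<String> current = scan("Example.java", Arrays.asList(
        "log.error(() -> \"upload failed\");",
        "  log.warn(() -> \"consider using the CRT client\");"));
    for (String entry : newEntries(previous, current)) {
      System.out.println("new: " + entry);
    }
  }
}
```

As the description notes, a comparison like this only finds statements that are new in the source; it cannot tell when an existing statement starts being reached more often.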