[ 
https://issues.apache.org/jira/browse/HADOOP-19272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19272:
------------------------------------
    Description: 

When an S3 transfer manager is created for renaming or downloading, a new 
message is logged telling off the caller for not using the CRT client.

{code}
5645:2024-09-13 16:29:17,375 [setup] WARN  s3.S3TransferManager 
(LoggerAdapter.java:warn(225)) - The provided S3AsyncClient is an instance of 
MultipartS3AsyncClient, and thus multipart download feature is not enabled. To 
benefit from all features, consider using S3AsyncClient.crtBuilder().build() 
instead.
{code}

This is a change in the SDK to tell us developers off, yet it is visible to end 
users, who don't benefit from it and for whom it only creates confusion.

It appears to have been downgraded to debug in the AWS trunk code in the PR "S3 
Async Client - Multipart download (#5164)", but:

* it is too late to upgrade and qualify a new version for 3.4.1; downgrading is 
all we can do
* there is no guarantee this log message or similar will not reoccur.

Plan:
1. Revert the SDK update from 3.4.1.
2. Lift the code from the cloudstore library which uses reflection to access and 
manipulate log4j logs where present.
3. Downgrade all transfer manager log levels to NONE. 
4. File an AWS report about how this is an incompatible regression and identify 
how their process can evolve, particularly in the area of code guidelines about 
safe logging use.
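
Steps 2 and 3 could look roughly like the following. This is a minimal sketch, 
not the cloudstore code itself. It assumes log4j 1.x (or a drop-in replacement 
such as reload4j) may or may not be on the classpath, that the logger is named 
after the SDK class software.amazon.awssdk.transfer.s3.S3TransferManager, and 
that NONE maps to the log4j level OFF, since log4j has no level called NONE. 
The names LogLevelSketch and setLogLevel are made up for illustration.

```java
import java.lang.reflect.Method;

/** Hypothetical helper: sets a log4j 1.x logger level without linking to log4j. */
final class LogLevelSketch {

  /** Assumed logger name, taken from the SDK class that printed the warning. */
  static final String TRANSFER_MANAGER_LOG =
      "software.amazon.awssdk.transfer.s3.S3TransferManager";

  /**
   * Try to set the level of a named logger through reflection.
   * @return true if log4j was found and the level was set; false otherwise.
   */
  static boolean setLogLevel(String loggerName, String levelName) {
    try {
      Class<?> loggerClass = Class.forName("org.apache.log4j.Logger");
      Class<?> levelClass = Class.forName("org.apache.log4j.Level");
      // static Logger Logger.getLogger(String)
      Method getLogger = loggerClass.getMethod("getLogger", String.class);
      // static Level Level.toLevel(String); unknown names fall back to DEBUG
      Method toLevel = levelClass.getMethod("toLevel", String.class);
      // void setLevel(Level), inherited from Category
      Method setLevel = loggerClass.getMethod("setLevel", levelClass);

      Object logger = getLogger.invoke(null, loggerName);
      Object level = toLevel.invoke(null, levelName);
      setLevel.invoke(logger, level);
      return true;
    } catch (ReflectiveOperationException | LinkageError e) {
      // log4j is absent or incompatible: leave logging untouched
      return false;
    }
  }

  public static void main(String[] args) {
    boolean changed = setLogLevel(TRANSFER_MANAGER_LOG, "OFF");
    System.out.println("log level changed: " + changed);
  }
}
```

Going through reflection keeps the caller free of a compile-time dependency on 
any one logging backend; when log4j is missing the call is a no-op that returns 
false.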

I also intend to tighten up our review process to support more rigorous 
detection of new .warn() messages in the AWS SDK. I'm going to propose that, as 
well as requiring review of our test/CLI output, we require ripgrep scans for 
.warn( and .error( in the SDK source and an audit of any new changes. By saving 
the output of the previous iteration, it'll be straightforward to identify new 
changes, but not changes in codepaths which alter how often existing messages 
appear.
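
The scan-and-compare idea above can be sketched in Java as follows; it is an 
illustration of the technique, not a proposed tool, and ripgrep itself would 
work just as well for the scan. The class name LogCallAudit, its methods, and 
the one-hit-per-line baseline file format are all assumptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical audit helper: finds .warn( and .error( calls and diffs two scans. */
final class LogCallAudit {

  private static final Pattern LOG_CALL = Pattern.compile("\\.(warn|error)\\(");

  /** Collect "relative/path: trimmed source line" for every matching line under root. */
  static Set<String> scan(Path root) throws IOException {
    Set<String> hits = new TreeSet<>();
    List<Path> sources;
    try (Stream<Path> files = Files.walk(root)) {
      sources = files
          .filter(p -> p.toString().endsWith(".java"))
          .collect(Collectors.toList());
    }
    for (Path file : sources) {
      for (String line : Files.readAllLines(file)) {
        if (LOG_CALL.matcher(line).find()) {
          // no line numbers, so code that merely moves is not reported as new
          hits.add(root.relativize(file) + ": " + line.trim());
        }
      }
    }
    return hits;
  }

  /** Entries present in the current scan but not in the saved previous one. */
  static Set<String> newSince(Set<String> previous, Set<String> current) {
    Set<String> added = new TreeSet<>(current);
    added.removeAll(previous);
    return added;
  }

  /** Usage: LogCallAudit baseline.txt path/to/sdk/source */
  public static void main(String[] args) throws IOException {
    Set<String> previous = new TreeSet<>(Files.readAllLines(Paths.get(args[0])));
    newSince(previous, scan(Paths.get(args[1]))).forEach(System.out::println);
  }
}
```

As the paragraph above says, a static comparison like this only surfaces new 
or reworded log statements; it cannot tell when an existing statement starts 
being reached more often.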

I think we should revisit whether or not to move off the transfer manager. 
We've discussed it in the past, and avoided it just due to maintenance costs. 
However, it is imposing maintenance costs on us anyway.

Meanwhile: no new AWS SDK updates until we are confident we have our processes 
under control.





  was:
In
When an S3 transfer manager is created for renaming/download a new message is 
logged telling off the caller for not using the CRT client.

{code}
5645:2024-09-13 16:29:17,375 [setup] WARN  s3.S3TransferManager 
(LoggerAdapter.java:warn(225)) - The provided S3AsyncClient is an instance of 
MultipartS3AsyncClient, and thus multipart download feature is not enabled. To 
benefit from all features, consider using S3AsyncClient.crtBuilder().build() 
instead.
{code}

This is a change in the SDK to tell us off

It appears to have been downgraded to debug in the aws trunk code "S3 Async 
Client - Multipart download (#5164) -but:

* it is too late to upgrade and qualify a new version for 3.4.1; downgrading is 
all we can do
* there is no guarantee this log message or similar will reoccur.

Plan
1. Revert from 3.4.1
2. lift code from cloudstore library which uses reflection to access and 
manipulate log4j logs where present
3. downgrade all transfer manager log levels to NONE. 
4. File an AWS report about how this is an incompatible regression, identify 
how their process can evolve, particularly in the area of code guidelines about 
safe logging use.

I also intend to tighten up our review process to support more rigorous 
detection of new .warn() messages in the AWS SDK. I'm going to propose that as 
well as requiring review of our test/CLI output, we require ripgrep scans of 
.warn(/.error( in SDK source, audit of any new changes. by saving the output of 
the previous iteration, it'll be straightforward to identify new changes -but 
not changes in codepaths which change their frequency of appearance.

meanwhile: no new AWS SDK updates until we are confident we have our processes 
under control.






> S3A: AWS SDK 2.25.53 warnings logged about transfer manager not using CRT 
> client
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-19272
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19272
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 3.4.0, 3.5.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>         Attachments: output.txt
>
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
