Izeren commented on PR #27788:
URL: https://github.com/apache/flink/pull/27788#issuecomment-4133740922

   I ran Claude on it, and it has some good points. I will have a deeper look 
later today or tomorrow
   
   ```
   PR: [FLINK-39166] Add bucket-level configuration support to Native S3 
FileSystem
     Author: Samrat002
     Verdict: REQUEST CHANGES
   
     Critical (2)
   
     1. Per-bucket clients don't inherit global config — When a bucket-specific 
config is defined, settings like maxRetries, maxConnections, 
credentialsProviderClasses, and assumeRoleSessionDuration silently fall back
      to hardcoded defaults instead of inheriting the globally configured 
values. For example, a global s3.retry.max-num-retries: 10 would be ignored — 
per-bucket clients retry only 3 times.
     2. S3ClientProviderCache is architecturally unnecessary — Flink's 
FileSystem.getUnguardedFileSystem() already caches filesystem instances by 
FSKey(scheme, authority), where authority = bucket name. Each
     NativeS3FileSystem instance serves exactly one bucket, so the cache with 
its HashMap, lock, and async close logic will never hold more than one entry. 
This adds complexity and synchronization overhead for no
     benefit.
   
     Major (5)
   
     3. Mixed concerns in one commit — Bundles unrelated 
CONNECTION_TIMEOUT/SOCKET_TIMEOUT config options and bulkCopy constructor 
refactoring with bucket-level config. Should be separate commits.
     4. Missing documentation — The primary feature 
(s3.bucket.<name>.<property> config format) is completely undocumented in the 
README. Only 2 lines were added for timeout config.
     5. No tests for S3ClientProviderCache — Has non-trivial concurrent logic 
(synchronized access, closed-state check, async close with timeout) but zero 
test coverage.
     6. Three telescoping constructors with 10-15 params each — Fragile API; 
should use a builder pattern or a single constructor since only the factory 
instantiates it.
     7. No validation for partial credentials — Setting access-key without 
secret-key (or vice versa) silently creates a broken config that will fail at 
runtime with a cryptic AWS SDK error instead of a clear
     config-time error.
   
     Minor (5)
   
     8. BucketConfigProvider should be package-private (only used within its 
package)
     9. S3BucketConfig.Builder constructor is public but outer class is 
package-private
     10. Duplicate null-handling for bucketConfig in both caller and callee
     11. closed field in cache missing @GuardedBy("cacheLock") annotation
     12. Lambda (config) -> createClientProviderForBucketConfig(config) can be 
a method reference
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to