[ https://issues.apache.org/jira/browse/FLINK-30975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Samrat Deb updated FLINK-30975:
-------------------------------
    Description: 
Currently, *Flink's S3 FileSystem* is limited to using AWS SDK V1. However, AWS strongly recommends adopting AWS SDK V2 because it offers significant improvements, including better performance, additional features, and extended maintenance support. Transitioning to AWS SDK V2 will ensure Flink remains aligned with AWS's long-term support strategy and benefits from enhancements available in the newer SDK.

h3. Modules Requiring Updates

To fully support AWS SDK V2, the following Flink modules need updates:
# *{{flink-s3-fs-base}}*
# *{{flink-s3-fs-hadoop}}*
# *{{flink-s3-fs-presto}}*

While the *Hadoop module* has already incorporated AWS SDK V2 support, the same cannot be said for {*}Presto's S3 FileSystem{*}, which currently lacks this capability. This gap creates a blocker for the {{flink-s3-fs-presto}} module to adopt AWS SDK V2.

h3. Options to Enable AWS SDK V2 Support for Flink's S3 FileSystem

# {*}Copy Presto's S3 FileSystem and Add AWS SDK V2 Support in Flink{*}:
** Flink can maintain its own version of Presto's S3 FileSystem, updated to support AWS SDK V2.
** This approach gives Flink immediate control over the feature but increases maintenance overhead as Flink will need to manage updates independently if Presto evolves further.
# {*}Update Presto's S3 FileSystem Directly{*}:
** Add AWS SDK V2 support to Presto's S3 FileSystem in Presto itself.
** Flink can then use the updated Presto version that includes AWS SDK V2 support.
** While this option ensures better collaboration and reuse across projects, it depends on the Presto community's priorities and timelines to accept and release these changes.
# {*}Adopt Trino's S3 FileSystem{*}:
** Trino's S3 FileSystem already supports AWS SDK V2.
** Flink could consider switching from Presto's S3 FileSystem to Trino's implementation.
** This approach avoids duplicating effort or waiting for Presto's support while benefiting from Trino's active maintenance and AWS SDK V2 support. However, it may require significant integration work and adjustments in Flink to support the Trino S3 FileSystem.

h3.

Transitioning to AWS SDK V2 for Flink's S3 FileSystem is essential to align with AWS's recommendations and benefit from better support. Among the proposed options:
* The first option offers quick resolution but increases long-term maintenance.
* The second option promotes collaboration but may be slower due to external dependencies.
* The third option is the most efficient in terms of leveraging existing work but may require substantial integration effort.

Choosing the right approach will depend on Flink's priorities, resources, and collaboration potential with Presto or Trino.

[changelog-details|https://github.com/aws/aws-sdk-java-v2/blob/master/docs/LaunchChangelog.md]

> Enable AWS SDK V2 Support for Flink's S3 FileSystem Modules
> -----------------------------------------------------------
>
>                 Key: FLINK-30975
>                 URL: https://issues.apache.org/jira/browse/FLINK-30975
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: Samrat Deb
>            Priority: Minor
>              Labels: auto-deprioritized-major, pull-request-available

--
This message was sent by Atlassian Jira
(v8.20.10#820010)