Hi All,

Poorvank (cc'ed) and I would like to start a discussion about a potential improvement for Flink: a new, native S3 filesystem that is independent of Hadoop and Presto.
The goal of this proposal is to address several challenges in Flink's S3 integration and to simplify flink-s3-filesystem. If this discussion gains positive traction, the next step would be to move forward with a formal FLIP.

*The Challenges with the Current S3 Connectors*

Currently, Flink offers two primary S3 filesystems, flink-s3-fs-hadoop [1] and flink-s3-fs-presto [2]. While functional, this dual-connector approach has a few issues:

1. The flink-s3-fs-hadoop connector adds a heavy external dependency to manage. Upgrades such as the AWS SDK for Java v2 first need to be supported by Hadoop/Presto before they can be leveraged in flink-s3-filesystem, which makes it restrictive to use features directly from the AWS SDK.
2. The flink-s3-fs-presto connector was introduced to mitigate the performance issues of the Hadoop connector, especially for checkpointing. However, it lacks a RecoverableWriter implementation, so it cannot serve the streaming file sink. Having to choose between the two is confusing for Flink users and highlights the need for a single, unified solution.

*Proposed Solution: A Native, Hadoop-Free S3 Filesystem*

I propose we develop a new filesystem, let's call it flink-s3-fs-native, built directly on the modern AWS SDK for Java v2 and free of any Hadoop or Presto dependencies. I have done a small prototype to validate the idea [3]. This is motivated by Trino's native S3 filesystem [4]: the Trino project successfully undertook a similar migration, moving from Hadoop-based object storage clients to its own native implementations.

The new Flink S3 filesystem would:

1. Provide a single, unified connector for all S3 interactions, from state backends to sinks.
2. Implement a high-performance S3RecoverableWriter using S3's Multipart Upload feature, ensuring exactly-once sink semantics (rough sketches of the write and read paths follow after the references).
3. Offer a clean, self-contained dependency, drastically simplifying setup and eliminating external dependencies.

*A Phased Migration Path*

To ensure a smooth transition, we could, at a very high level, adopt a phased approach:

Phase 1: Introduce the new native S3 filesystem as an optional, parallel plugin. This would allow for community testing and adoption without breaking existing setups.
Phase 2: Once the native connector achieves feature parity and proven stability, we update the documentation to recommend it as the default choice for all S3 use cases.
Phase 3: In a future major release, the legacy flink-s3-fs-hadoop and flink-s3-fs-presto connectors could be formally deprecated, with clear migration guides provided for users.

I would love to hear the community's thoughts on this. A few questions to start the discussion:

1. What are the biggest pain points with the current S3 filesystems?
2. Are there any critical features of the Hadoop S3A client that are essential to replicate in a native implementation?
3. Would a simplified, dependency-free S3 experience be a valuable improvement for your Flink use cases?

Cheers,
Samrat

[1] https://github.com/apache/flink/tree/master/flink-filesystems/flink-s3-fs-hadoop
[2] https://github.com/apache/flink/tree/master/flink-filesystems/flink-s3-fs-presto
[3] https://github.com/Samrat002/flink/pull/4
[4] https://github.com/trinodb/trino/tree/master/lib/trino-filesystem-s3
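
P.S. To make the RecoverableWriter point concrete, below is a rough sketch of how a commit-on-checkpoint writer could map onto the AWS SDK v2 multipart upload API with no Hadoop layer in between. All class and method names here are illustrative only (nothing is a final design), but the SDK calls are the real v2 API:

import java.util.ArrayList;
import java.util.List;

import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.CompletedMultipartUpload;
import software.amazon.awssdk.services.s3.model.CompletedPart;

// Illustrative sketch, not a final design: a RecoverableWriter-style sink
// sitting directly on the AWS SDK v2 multipart upload API.
public class NativeS3MultipartWriter {

    private final S3Client s3;
    private final String bucket;
    private final String key;
    private final String uploadId;
    private final List<CompletedPart> completedParts = new ArrayList<>();
    private int nextPartNumber = 1;

    public NativeS3MultipartWriter(S3Client s3, String bucket, String key) {
        this.s3 = s3;
        this.bucket = bucket;
        this.key = key;
        // Start the multipart upload. The uploadId plus the accumulated part
        // ETags is all the state a ResumeRecoverable would need to persist.
        this.uploadId = s3.createMultipartUpload(b -> b.bucket(bucket).key(key)).uploadId();
    }

    // Uploads one buffered part (>= 5 MiB, except the last) and records its ETag.
    public void writePart(byte[] buffer) {
        int partNumber = nextPartNumber++;
        String eTag = s3.uploadPart(
                        b -> b.bucket(bucket).key(key).uploadId(uploadId).partNumber(partNumber),
                        RequestBody.fromBytes(buffer))
                .eTag();
        completedParts.add(CompletedPart.builder().partNumber(partNumber).eTag(eTag).build());
    }

    // The Committer#commit equivalent: completing the upload publishes the
    // object atomically, which is what enables exactly-once semantics on S3.
    public void commit() {
        s3.completeMultipartUpload(b -> b.bucket(bucket).key(key).uploadId(uploadId)
                .multipartUpload(CompletedMultipartUpload.builder().parts(completedParts).build()));
    }
}

Persisting the uploadId together with the part ETags on checkpoint would let the sink resume an in-flight upload after a failover, with no rename step needed (S3 has none anyway).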

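Similarly, on the read path (e.g. state restore), seekable reads become plain ranged GetObject calls; again, the naming below is purely hypothetical:

import java.io.InputStream;

import software.amazon.awssdk.services.s3.S3Client;

// Hypothetical sketch of the read side: ranged GetObject instead of Hadoop S3A.
public class NativeS3RangeReader {

    private final S3Client s3;
    private final String bucket;
    private final String key;

    public NativeS3RangeReader(S3Client s3, String bucket, String key) {
        this.s3 = s3;
        this.bucket = bucket;
        this.key = key;
    }

    // Reads [position, position + length) via an HTTP Range header,
    // avoiding a full-object download on every seek.
    public InputStream readRange(long position, long length) {
        String range = String.format("bytes=%d-%d", position, position + length - 1);
        return s3.getObject(b -> b.bucket(bucket).key(key).range(range));
    }
}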