+1 for David's suggestion. We should move away from the current approach with two abstractions and get to one rock-solid one.

On Mon, Oct 2, 2023 at 11:13 PM David Morávek <d...@apache.org> wrote:

> Hi Maomao,
>
> I wonder whether it would make sense to take a stab at consolidating the S3
> filesystems instead and introduce a native one. The whole Hadoop wrapper
> around the S3 client exists for legacy reasons, and it adds complexity and
> probably an unnecessary performance penalty.
>
> If you take a look at the underlying presto implementation, it's actually
> not too complex to adapt to Flink interfaces (since you're proposing to
> maintain a copy of it anyway).
>
> Overall, the S3 FS is probably the most used one that we have so this could
> be rather high impact. It would also eliminate user confusion when choosing
> the implementation to use.
>
> WDYT?
>
> Best,
> D.
>
> On Fri, Sep 29, 2023 at 2:41 PM Min, Maomao <mimao...@amazon.com.invalid>
> wrote:
>
> > Hi Flink Dev,
> >
> > I’m Maomao, a developer from AWS EMR.
> >
> > Recently, our team is working on adding AWS SDK V2 support for Flink’s S3
> > Filesystem. During development, we found out that our work was blocked by
> > Presto. This is because that Presto still uses AWS SDK V1 and won’t add
> > support for AWS SDK V2 in short term. To unblock, our team proposed several
> > options and I’ve created a JIRA issue as here<
> > https://issues.apache.org/jira/browse/FLINK-33157>.
> >
> > Since our team plans to contribute this work back to the community later,
> > we’d like to collect feedback from the community about the options we
> > proposed in the long term so that the community won’t need to duplicate
> > this work in the future.
> >
> > Best,
> > Maomao