Re: Support AWS SDK V2 for Flink's S3 FileSystem

2023-10-20 Thread David Morávek
I'm soft -1 on depending on a private copy of a 3rd party dependency unless necessary (in this case, it feels avoidable), but I won't block this if others think it's a good way forward. An AWS SDK bump sounds like a patch that the Presto community might happily take in. Have you explored that option? (

Re: Support AWS SDK V2 for Flink's S3 FileSystem

2023-10-10 Thread Jing Ge
+1 for the s3 file consolidation. We already have many issues with internal communication and talking to customers. Different file schemas are not very user friendly, btw. Best regards, Jing On Mon, Oct 9, 2023 at 6:49 PM Matthias Pohl wrote: > I would agree with David's proposal as well. > >

Re: Support AWS SDK V2 for Flink's S3 FileSystem

2023-10-10 Thread Zhao, Kevin
Looks like Maomao was dropped from the previous replies. Adding back @Maomao. Thanks everyone for your responses. We are having some discussions within the AWS EMR team and will get back to you very soon. Regards, Kevin From: Matthias Pohl Date: Tuesday, October 10, 2023 at 15:3

Re: Support AWS SDK V2 for Flink's S3 FileSystem

2023-10-10 Thread Matthias Pohl
Just to add a bit more context to the performance test question: What I had in mind was the exists call on (non-existing) directories in a bucket with a lot of objects. A comment from one of the SDK contributors about that call was that it could be an expensive call in an object store if implemen

Re: Support AWS SDK V2 for Flink's S3 FileSystem

2023-10-09 Thread Matthias Pohl
I would agree with David's proposal as well. Would it make sense to come up with some performance comparisons for the different S3 implementations in the end? ...just to ensure that we're improving things or (at least) don't make things worse. Or is there something like that already somewhere? A

Re: Support AWS SDK V2 for Flink's S3 FileSystem

2023-10-03 Thread Martijn Visser
+1 for David's suggestion. We should get away from the current approach with two abstractions and get to one rock-solid one. On Mon, Oct 2, 2023 at 11:13 PM David Morávek wrote: > > Hi Maomao, > > I wonder whether it would make sense to take a stab at consolidating the S3 > filesystems instead an

Re: Support AWS SDK V2 for Flink's S3 FileSystem

2023-10-02 Thread David Morávek
Hi Maomao, I wonder whether it would make sense to take a stab at consolidating the S3 filesystems instead and introduce a native one. The whole Hadoop wrapper around the S3 client exists for legacy reasons, and it adds complexity and probably an unnecessary performance penalty. If you take a loo