Re: Parallisation of S3 write sink

2020-04-03 Thread David Magalhães
Thanks for your feedback, Till. I think the best approach in this scenario is to go with the ThreadPool.

On Fri, Apr 3, 2020 at 1:47 PM Till Rohrmann wrote:
> Hi David,
>
> I assume that you have written your own TwoPhaseCommitSink which writes to
> S3, right? If that is the case, then it is main

Re: Parallisation of S3 write sink

2020-04-03 Thread Till Rohrmann
Hi David, I assume that you have written your own TwoPhaseCommitSink which writes to S3, right? If that is the case, then it is mainly up to your implementation how it writes files to S3. If your S3 client supports uploading multiple files concurrently, then you should go for it. Async I/O won't
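
A minimal sketch of the concurrent-upload idea, using a fixed-size thread pool from the JDK. The class name `ParallelUploader`, the method `uploadAll`, and the `uploadOne` callback are hypothetical and only stand in for whatever S3 client call the sink actually makes; where this would be invoked inside the sink (for example before the checkpoint completes) is likewise an assumption, not something stated in the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Hypothetical helper: uploads many small files concurrently instead of one by one.
class ParallelUploader {

    // Submits one upload task per key and blocks until all of them have finished.
    // uploadOne is a placeholder for the real S3 client call (an assumption).
    static void uploadAll(List<String> keys, Consumer<String> uploadOne, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<Void>> futures = new ArrayList<>();
            for (String key : keys) {
                futures.add(CompletableFuture.runAsync(() -> uploadOne.accept(key), pool));
            }
            // join() rethrows (wrapped in CompletionException) if any upload failed,
            // so the caller can fail the transaction rather than commit partial data.
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Fake "upload" that only records the key; replace with a real client call.
        Set<String> uploaded = ConcurrentHashMap.newKeySet();
        uploadAll(Arrays.asList("part-0", "part-1", "part-2"), uploaded::add, 2);
        System.out.println("uploaded " + uploaded.size() + " files");
    }
}
```

Blocking on all futures keeps the semantics of the sequential version (nothing is considered written until every file is uploaded) while letting the individual uploads overlap. The pool size is a tuning knob that would need to be chosen against the S3 client's connection limits.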

Parallisation of S3 write sink

2020-04-03 Thread David Magalhães
I have a scenario where multiple small files need to be written to S3. I'm using a TwoPhaseCommit sink since I have a specific scenario where I can't use StreamingFileSink. I've noticed that because of the way the S3 write is done (sequentially), the checkpoint is timing out (10 minutes), because it t