Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/6108

@kl0u please link the issue once you have created it.

This is currently very early, in design discussions between @kl0u, me, and @aljoscha. The main points about the rewrite are:

- Use Flink's FileSystem abstraction, to make it work with shaded S3, Swift, etc. and give an easier interface.
- Add a proper "ChunkedWriter" abstraction to the FileSystems, which handles write, persist-on-checkpoint, and rollback-to-checkpoint in a FileSystem-specific way. For example, use truncate()/append() on POSIX and HDFS, use MultiPartUploads on S3, ...
- Add support for gathering large chunks across checkpoints, to make Parquet and ORC compression more effective.
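
To make the "ChunkedWriter" point more concrete, here is a rough sketch of what the write / persist-on-checkpoint / rollback-to-checkpoint contract could look like. Only the name `ChunkedWriter` comes from the discussion above; the method names (`write`, `persist`, `rollbackTo`), the `long` resume token, and the in-memory `InMemoryTruncatingWriter` are assumptions made for illustration and are not Flink's API. The in-memory class imitates the truncate-style strategy mentioned for POSIX and HDFS; an S3 variant would presumably track completed multipart-upload parts instead.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ChunkedWriterSketch {

    /** Hypothetical contract; method names and the token type are assumptions. */
    interface ChunkedWriter {
        /** Appends data to the in-progress chunk. */
        void write(byte[] data);

        /** Called on checkpoint: makes the data written so far durable and returns a resume token. */
        long persist();

        /** Called on recovery: discards everything written after the given token. */
        void rollbackTo(long token);
    }

    /** In-memory stand-in for a file system that supports truncate and append. */
    static final class InMemoryTruncatingWriter implements ChunkedWriter {
        private byte[] buffer = new byte[0];

        @Override
        public void write(byte[] data) {
            byte[] grown = Arrays.copyOf(buffer, buffer.length + data.length);
            System.arraycopy(data, 0, grown, buffer.length, data.length);
            buffer = grown;
        }

        @Override
        public long persist() {
            // The token is simply the length of the data known to be valid.
            return buffer.length;
        }

        @Override
        public void rollbackTo(long token) {
            if (token < 0 || token > buffer.length) {
                throw new IllegalArgumentException("Invalid resume token: " + token);
            }
            // Equivalent of truncate(): drop all bytes past the checkpointed length.
            buffer = Arrays.copyOf(buffer, (int) token);
        }

        String contents() {
            return new String(buffer, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) {
        InMemoryTruncatingWriter writer = new InMemoryTruncatingWriter();

        writer.write("record-1\n".getBytes(StandardCharsets.UTF_8));
        long checkpoint = writer.persist();   // checkpoint completes here

        writer.write("record-2\n".getBytes(StandardCharsets.UTF_8));
        writer.rollbackTo(checkpoint);        // failure: restore to the checkpoint

        // Only the data written before the checkpoint remains.
        System.out.print(writer.contents());
    }
}
```

Keeping the resume token opaque to the caller would let each FileSystem pick its own representation (a byte offset here, a list of uploaded parts elsewhere) without changing the sink code.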
---