Hi,

1. With the S3 FileSystem, Flink uses the multipart upload process [1] for better performance. Parts of a multipart upload that has not been completed are not visible as objects in the bucket, which is why you only see the finished files there. This may not be obvious from the docs at first, but it is noted at the bottom of the FileSystem page [2]. For more information, see also FLINK-9751 and FLINK-9752.

2. With the local FileSystem, the name always starts with a dot according to LocalRecoverableWriter [3], but make sure to check the RecoverableWriter implementation for the FileSystem you want to use.

Regards,
Mate

[1] https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html
[2] https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/connectors/datastream/filesystem/#s3-specific
[3] https://github.com/apache/flink/blob/1e0b58aa8d962469fa9dd7b470037aeaece43500/flink-core/src/main/java/org/apache/flink/core/fs/local/LocalRecoverableWriter.java#L129

Chirag Dewan via user <user@flink.apache.org> wrote (on Wed, Mar 29, 2023, 9:07):

> Hi,
>
> We are trying to use Flink's File sink to distribute files to AWS S3
> storage. We are using the Flink-provided Hadoop s3a connector as a plugin.
>
> We have some observations that we need to clarify:
>
> 1. When using the file sink for local filesystem distribution, we can see
> that the sink creates 3 sets of files: in-progress, pending (on rolling)
> and finished (upon checkpointing). But with the S3 file sink we can see
> only the finished files in the S3 buckets.
>
> So we wanted to understand where the sink creates the in-progress and
> pending files for the S3 file sink?
>
> 2. We can also see that with the local file system sink, the in-progress
> and pending file names follow the nomenclature:
> .<prefix>-<uid>-<partFileIndex>.inprogress.uid-<suffix>
>
> There is a dot at the beginning of the filename; maybe Flink is trying to
> create these files as hidden files. But this is not mentioned in the
> Flink documentation.
>
> So can we assume that the in-progress and pending filenames shall always
> start with a dot?
>
> Thanks a lot in advance
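
P.S. To illustrate point 2: a minimal sketch, assuming the local-filesystem naming described in this thread (a leading dot plus an ".inprogress." marker in the file name). The helper names are hypothetical and not part of Flink, and the check only reflects the local RecoverableWriter; other filesystems may name their staging files differently.

```python
from pathlib import Path


def is_local_staging_file(path):
    """Return True if the file name looks like a local in-progress/pending
    part file: leading dot and an '.inprogress.' marker.

    Assumption: this mirrors the local-filesystem naming only.
    """
    name = Path(path).name
    return name.startswith(".") and ".inprogress." in name


def finished_files(paths):
    """Keep only the paths that do not look like staging files."""
    return [p for p in paths if not is_local_staging_file(p)]
```

A downstream reader of a local output directory could call finished_files() on a directory listing to skip part files that Flink has not committed yet.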