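Hi Dominik,

I think having a single output file is only possible if you set the parallelism of the sink to 1. AFAIK it is not possible to write to a single HDFS file concurrently from multiple clients, so each parallel instance of the sink has to write its own file. The rest of the topology can keep its parallelism of 6; only the sink needs to be limited.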
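
To illustrate why a sink with parallelism 1 gives you one file, here is a rough plain-Java model. It is not Flink code: the class name, the queue and the end marker are all made up for the sketch. Six producer threads stand in for the parallel subtasks, and a single writer thread stands in for the one sink instance that receives every record. In Flink itself, the equivalent step should be calling `setParallelism(1)` on the object returned by `addSink(...)`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Plain-Java model (NOT Flink API): N parallel producers, exactly one writer.
class SingleWriterSketch {
    // Hypothetical end-of-input marker; each producer sends it once when done.
    private static final String END = "\u0000END";

    static List<String> run(int producers, int recordsPerProducer) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        // Stands in for the single output file; only the writer thread touches it.
        List<String> singleFile = new ArrayList<>();

        // The only writer: like a sink with parallelism 1, it sees every record.
        Thread writer = new Thread(() -> {
            int finished = 0;
            try {
                while (finished < producers) {
                    String record = queue.take();
                    if (record.equals(END)) {
                        finished++;
                    } else {
                        singleFile.add(record);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();

        // The parallel part of the job: each thread models one upstream subtask.
        List<Thread> workers = new ArrayList<>();
        for (int p = 0; p < producers; p++) {
            final int id = p;
            Thread t = new Thread(() -> {
                for (int i = 0; i < recordsPerProducer; i++) {
                    queue.add("subtask-" + id + "-record-" + i);
                }
                queue.add(END);
            });
            workers.add(t);
            t.start();
        }

        try {
            for (Thread t : workers) {
                t.join();
            }
            writer.join(); // after this, singleFile is safe to read
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return singleFile;
    }

    public static void main(String[] args) {
        List<String> out = run(6, 3);
        System.out.println(out.size() + " records ended up in one output");
    }
}
```

The trade-off is the same as in the real job: everything funnels through one writer, so that writer can become the bottleneck, and the order in which records from different subtasks arrive is not fixed.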

Cheers,
Aljoscha

On Wed, 14 Dec 2016 at 20:57 Dominik Safaric <dominiksafa...@gmail.com> wrote:

> Hi everyone,
>
> although this question might sound trivial, I’ve been curious about the
> following. Given a Flink topology with parallelism level set to 6 for
> example and outputting the data stream to HDFS using an instance of
> RollingSink, how is the output file structured? By structured, I refer to
> the fact that this will result in 6 distinct block files, whereas I would
> like to have a single file containing all of the output values from the
> DataStream.
>
> Regards,
> Dominik