Hi Dominik,
I think a single output file is only possible if you set the
parallelism of the sink to 1. AFAIK, it is not possible for multiple
clients to write concurrently to a single HDFS file, so each parallel
sink instance has to write its own file.
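
In Flink terms that would be something along the lines of
`stream.addSink(new RollingSink<String>(basePath)).setParallelism(1)`,
where `stream` and `basePath` are placeholder names for your own job. To
illustrate why the parallelism determines the number of files, here is a
rough plain-Java model (no Flink or HDFS involved). The class name, the
method name, the round-robin distribution and the `part-<subtask>-0` file
naming are all assumptions made for the sketch, not what the sink
actually does internally:

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Plain-Java model only: simulates N parallel sink instances on the local
// file system. It does not use Flink or HDFS.
final class SinkParallelismSketch {

    // Every simulated sink instance ("subtask") opens its own file, because
    // a file can only have one writer at a time. The part-<subtask>-0 naming
    // is an assumption for illustration.
    static List<Path> writeWithParallelism(Path dir, List<String> records, int parallelism)
            throws IOException {
        if (parallelism < 1) {
            throw new IllegalArgumentException("parallelism must be >= 1");
        }
        List<Path> files = new ArrayList<>();
        List<BufferedWriter> writers = new ArrayList<>();
        try {
            for (int subtask = 0; subtask < parallelism; subtask++) {
                Path file = dir.resolve("part-" + subtask + "-0");
                files.add(file);
                writers.add(Files.newBufferedWriter(file, StandardCharsets.UTF_8));
            }
            // Hand the records to the subtasks round-robin (hypothetical
            // distribution; a real job partitions according to its topology).
            for (int i = 0; i < records.size(); i++) {
                BufferedWriter writer = writers.get(i % parallelism);
                writer.write(records.get(i));
                writer.newLine();
            }
        } finally {
            for (BufferedWriter writer : writers) {
                writer.close();
            }
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        List<String> records = new ArrayList<>();
        for (int i = 0; i < 12; i++) {
            records.add("value-" + i);
        }
        Path six = Files.createTempDirectory("sink-p6");
        Path one = Files.createTempDirectory("sink-p1");
        // Six writers produce six files; a single writer produces one file.
        System.out.println(writeWithParallelism(six, records, 6));
        System.out.println(writeWithParallelism(one, records, 1));
    }
}
```

The trade-off is that with a sink parallelism of 1 all records go through
one writer, so the sink may become the bottleneck of the job.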

Cheers,
Aljoscha

On Wed, 14 Dec 2016 at 20:57 Dominik Safaric <dominiksafa...@gmail.com>
wrote:

> Hi everyone,
>
> although this question might sound trivial, I’ve been curious about the
> following. Given a Flink topology with the parallelism set to 6, for
> example, and outputting the data stream to HDFS using an instance of
> RollingSink, how is the output file structured? By structured, I refer to
> the fact that this will result in 6 distinct block files, whereas I would
> like to have a single file containing all of the output values from the
> DataStream.
>
> Regards,
> Dominik
