Changing parallelism on BucketingSink

Felix Cheung Mon, 21 Aug 2017 10:05:17 -0700

Hi,

I'm implementing a custom sink. The job is reading a DataStream into this 
custom sink.
I'd like to be able to maximize the parallelism to use all available slots in 
the cluster, but to
write to a smaller sets of files in the final output.


When I implement this sink with DataStream.writeAsText, I get a DataStreamSink 
which has the setParallelism() method.
However, when I implement using BucketingSink, to leverage the ability to 
bucket to paths and limit file sizes, it seems there is no available option to 
change the parallelism.
It seems this isn't available either in AbstractRichFunction, RichSinkFunction, 
or SinkFunction?

It seems the only way is to change the default parallelism on the "current" 
ExecutionEnvironment, before calling addSink on the DataStream?

Any suggestion would be appreciated!

Changing parallelism on BucketingSink

Reply via email to