Hi everyone, I’m using a Kafka source with a lot of watermark skew (i.e. new partitions were added to the topic over time). The sink is a FileIO.Write().withNumShards(1) to get ~ 1 file per day & an early trigger to write at most 40,000 records per file. Unfortunately it looks like there's 1 thread trying to write files for all the various days, instead of writing multiple days' files in parallel. Is there anything I could do here to parallelize the process? All of this is with the Flink runner.
Mike. Ladder <http://bit.ly/1VRtWfS>. The smart, modern way to insure your life.