Hi,

I am trying to build a test suite for our Flink jobs, and I am having
trouble making the input data deterministic.

We are reading a file input with parallelism 1 and want to rescale to a
higher parallelism in such a way that the ordering of the data is the same
on every run.

I have tried using rebalance and rescale, but they seem to distribute the
data between partitions non-deterministically. We don't need anything
optimized, we just need the same distribution for every run.
Is this possible?

Some code:

import org.apache.flink.streaming.api.TimeCharacteristic
import org.apache.flink.streaming.api.scala._

val env: StreamExecutionEnvironment =
  StreamExecutionEnvironment.getExecutionEnvironment
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
env.setParallelism(parallelism)

// Read the file with a single source task so the raw order is stable.
val rawStream: DataStream[String] =
  env.readTextFile(s3Path).setParallelism(1)

rawStream.rescale
...
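
To illustrate what we mean by "the same distribution": something along the
lines of the sketch below would be fine for us, assuming a custom
partitioner is an acceptable way to do this. The partitioner name and the
key extractor (which just reuses the line itself) are placeholders, not
what we actually run:

import org.apache.flink.api.common.functions.Partitioner

// Sketch only: route each line to a subtask based on a stable hash,
// so the same line always ends up on the same partition index.
val stablePartitioner = new Partitioner[String] {
  override def partition(key: String, numPartitions: Int): Int =
    Math.floorMod(key.hashCode, numPartitions)
}

val partitioned: DataStream[String] =
  rawStream.partitionCustom(stablePartitioner, (line: String) => line)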

best regards

-- 

Martin Frank Hansen
