Hi Jaffe,

Thanks for your reply. I will try to use a custom Partitioner.
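For reference, here is a minimal sketch of the kind of partitioner I have in mind (assuming Flink's Scala DataStream API; the object name, path, parallelism, and hashCode-based key selector are just placeholders, not our actual job):

import org.apache.flink.api.common.functions.Partitioner
import org.apache.flink.streaming.api.scala._

object DeterministicRescaleTest {

  // Maps the same key to the same downstream subtask on every run,
  // unlike rebalance/rescale whose assignment depends on scheduling.
  class ModuloPartitioner extends Partitioner[Int] {
    override def partition(key: Int, numPartitions: Int): Int =
      math.abs(key % numPartitions)
  }

  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setParallelism(4) // placeholder parallelism

    // Read with parallelism 1 as in the snippet below; the path is a placeholder.
    val rawStream: DataStream[String] =
      env.readTextFile("s3://bucket/input").setParallelism(1)

    // Derive the routing key from the element itself, so the element-to-subtask
    // mapping is identical for every run over the same input file.
    val partitioned: DataStream[String] =
      rawStream.partitionCustom(new ModuloPartitioner, (line: String) => line.hashCode)

    partitioned.print()
    env.execute("deterministic-rescale-test")
  }
}

Lines that hash to the same key will end up on the same subtask, so the distribution is not balanced, but it should be the same for every run over the same input, which is all we need for the test.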
On Thu, 14 Jan 2021 at 19:39, Jaffe, Julian <julianja...@activision.com> wrote:

> Martin,
>
> You can use `.partitionCustom` and provide a partitioner if you want to
> control explicitly how elements are distributed to downstream tasks.
>
> From: Martin Frank Hansen <m...@berlingskemedia.dk>
> Reply-To: "m...@berlingskemedia.dk" <m...@berlingskemedia.dk>
> Date: Thursday, January 14, 2021 at 1:48 AM
> To: user <user@flink.apache.org>
> Subject: Deterministic rescale for test
>
> Hi,
>
> I am trying to make a test suite for our Flink jobs, and I am having
> problems making the input data deterministic.
>
> We are reading a file input with parallelism 1 and want to rescale to a
> higher parallelism, such that the ordering of the data is the same every
> time.
>
> I have tried using rebalance and rescale, but they seem to randomly
> distribute data between partitions. We don't need something optimized, we
> just need the same distribution for every run.
>
> Is this possible?
>
> Some code:
>
> val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment
> env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
> env.setParallelism(parallelism)
> val rawStream: DataStream[String] = env.readTextFile(s3Path).setParallelism(1)
>
> rawStream.rescale
>
> ...
>
> best regards
>
> --
> Martin Frank Hansen


--
Martin Frank Hansen