Hi Jaffe,

Thanks for your reply, I will try to use a custom Partitioner.
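
Something along these lines is what I have in mind, just a rough sketch: the partitioner class, the key choice, and the path and parallelism values are placeholders, and I have not tested it yet.

import org.apache.flink.api.common.functions.Partitioner
import org.apache.flink.streaming.api.scala._

// Placeholder partitioner: route each record by a deterministic function of the
// record itself, so the same line always lands on the same downstream subtask.
class LinePartitioner extends Partitioner[String] {
  override def partition(key: String, numPartitions: Int): Int =
    math.abs(key.hashCode % numPartitions)
}

object DeterministicRescaleTest {
  def main(args: Array[String]): Unit = {
    val parallelism = 4                     // placeholder value
    val s3Path = "s3://my-bucket/input.txt" // placeholder path

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setParallelism(parallelism)

    // Single-parallelism source keeps the read order stable.
    val rawStream: DataStream[String] =
      env.readTextFile(s3Path).setParallelism(1)

    // partitionCustom takes the Partitioner and a key selector;
    // here the whole line is used as the key.
    val partitioned: DataStream[String] =
      rawStream.partitionCustom(new LinePartitioner, (line: String) => line)

    partitioned.print()
    env.execute("deterministic-rescale-test")
  }
}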

On Thu, 14 Jan 2021 at 19:39, Jaffe, Julian <julianja...@activision.com> wrote:

> Martin,
>
>
>
> You can use `.partitionCustom` and provide a partitioner if you want to
> control explicitly how elements are distributed to downstream tasks.
>
>
>
> From: Martin Frank Hansen <m...@berlingskemedia.dk>
> Reply-To: "m...@berlingskemedia.dk" <m...@berlingskemedia.dk>
> Date: Thursday, January 14, 2021 at 1:48 AM
> To: user <user@flink.apache.org>
> Subject: Deterministic rescale for test
>
>
>
> Hi,
>
> I am trying to make a test suite for our Flink jobs, and am having
> problems making the input data deterministic.
>
> We are reading a file input with parallelism 1 and want to rescale to a
> higher parallelism, such that the ordering of the data is the same every
> time.
>
> I have tried using rebalance and rescale, but they seem to distribute the
> data between partitions randomly. We don't need something optimized, we
> just need the same distribution for every run.
>
> Is this possible?
>
>
> Some code:
>
> val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment
> env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
> env.setParallelism(parallelism)
> val rawStream: DataStream[String] = env.readTextFile(s3Path).setParallelism(1)
>
> rawStream.rescale
>
> ...
>
> best regards
>
>
>
> --
>
> Martin Frank Hansen
>
>
>


-- 

Martin Frank Hansen
