Hi Davood,
A custom KeySelector may help here: it lets you define the key that is used 
to partition the stream. See the code in [1] for details.

[1] https://github.com/apache/flink/blob/8d05e91945c6c8d83f9924c00890ccf350f1f36f/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/partitioner/KeyGroupStreamPartitioner.java#L58
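To see which subtask a given key lands on, and to pick replacement keys for a custom KeySelector, something like the sketch below might help. It is a standalone re-implementation of what I understand the key-group assignment to be (murmur hash of key.hashCode(), modulo max parallelism, then scaled down to the operator index), so please treat the hash constants and the default max parallelism of 128 as assumptions and verify against your Flink version. In a real job you would call KeyGroupRangeAssignment.assignKeyToParallelOperator(key, maxParallelism, parallelism) instead of the copied logic. KeyGroupSketch, operatorIndexFor and surrogateKeys are made-up names for illustration.

```java
import java.util.Arrays;

final class KeyGroupSketch {

    // Assumption: default max parallelism for small jobs; set explicitly in a real job.
    static final int MAX_PARALLELISM = 128;

    // Assumed to mirror Flink's MathUtils.murmurHash; always returns a non-negative int.
    static int murmurHash(int code) {
        code *= 0xcc9e2d51;
        code = Integer.rotateLeft(code, 15);
        code *= 0x1b873593;
        code = Integer.rotateLeft(code, 13);
        code = code * 5 + 0xe6546b64;
        code ^= 4;
        // Final bit mixing step.
        code ^= code >>> 16;
        code *= 0x85ebca6b;
        code ^= code >>> 13;
        code *= 0xc2b2ae35;
        code ^= code >>> 16;
        if (code >= 0) {
            return code;
        }
        return code != Integer.MIN_VALUE ? -code : 0;
    }

    // Key -> key group -> index of the parallel subtask that receives the key.
    static int operatorIndexFor(Object key, int maxParallelism, int parallelism) {
        int keyGroup = murmurHash(key.hashCode()) % maxParallelism;
        return keyGroup * parallelism / maxParallelism;
    }

    // For each subtask index, find the smallest non-negative Integer key routed to it.
    static int[] surrogateKeys(int maxParallelism, int parallelism) {
        int[] result = new int[parallelism];
        Arrays.fill(result, -1);
        int found = 0;
        for (int candidate = 0; candidate < 1_000_000 && found < parallelism; candidate++) {
            int idx = operatorIndexFor(candidate, maxParallelism, parallelism);
            if (result[idx] < 0) {
                result[idx] = candidate;
                found++;
            }
        }
        if (found < parallelism) {
            throw new IllegalStateException("no surrogate key found for some subtask");
        }
        return result;
    }

    public static void main(String[] args) {
        int parallelism = 2;
        // Show where the keys from the original question would be routed.
        for (int key : new int[] {10, 20, 1000, 1001}) {
            System.out.println("key " + key + " -> subtask "
                    + operatorIndexFor(key, MAX_PARALLELISM, parallelism));
        }
        // Keys a custom KeySelector could return to hit each subtask exactly once.
        System.out.println("surrogate keys per subtask: "
                + Arrays.toString(surrogateKeys(MAX_PARALLELISM, parallelism)));
    }
}
```

With the surrogate keys in hand, a KeySelector could return surrogate[0] for key 1000 and surrogate[1] for key 1001, so that each of the two subtasks receives one key. This only works if the set of keys is known up front, as in your case, and the mapping must stay deterministic for every record.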

Best, Congxian
On Apr 5, 2019, 06:35 +0800, Davood Rafiei <rafieidavo...@gmail.com>, wrote:
> Hi all,
>
> I partition a DataStream (say dsA) with parallelism 2 and get a KeyedStream 
> (say ksA) with parallelism 2.
> Depending on the keys in dsA, one partition in ksA remains empty.
> For example, when the keys in dsA are 10 and 20, both partitions in ksA 
> receive data.
> However, with keys 1000 and 1001, only one partition in ksA receives all of 
> the upstream data.
> Is there any way to get information about the key ranges for each downstream 
> partition?
> Or is there any way to overcome this issue?
> We can assume that I know all possible keys in dsA (in this case 2 different 
> keys), and therefore I want all partitions in ksA to be fully utilized.
>
> Thanks,
> Davood
>
