Hi

If you just want to make sure some key goes into the same subtask, does
custom key selector[1] help?

For the keygroup and subtask information, you can ref to
KeyGroupRangeAssignment[2] for more info, and the max parallelism logic you
can ref to doc[3]

[1]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/api_concepts.html#define-keys-using-key-selector-functions
[2]
https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/state/KeyGroupRangeAssignment.java
[3]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/parallel.html#setting-the-maximum-parallelism

Best,
Congxian


杨东晓 <laolang...@gmail.com> 于2020年1月9日周四 上午7:47写道:

> Hi , I'm trying to do some optimize about Flink 'keyby' processfunction.
> Is there any possible I can find out one key belongs to which key-group
> and essentially find out one key-group belongs to which subtask.
> The motivation I want to know that is we want to  force the data records
> from upstream still goes to same taskmanager downstream subtask .Which
> means even if we use a keyedstream function we still want no cross jvm
> communication happened during run time.
> And if we can achieve that , can we also avoid the expensive cost for
> record serialization because data is only transferred in same taskmanager
> jvm instance?
>
> Thanks.
>

Reply via email to