Re: How can I find out which key group belongs to which subtask

2020-01-13 Thread Till Rohrmann
This feature won't be more public than it is today.

Cheers,
Till

On Fri, Jan 10, 2020 at 9:51 PM 杨东晓 wrote:
> Thanks Till, I will do some tests on this. Will this be a public
> feature in the next release or later?
>
> Till Rohrmann wrote on Fri, Jan 10, 2020 at 6:15 AM:
>> Hi,
>>
>> you would n

Re: How can I find out which key group belongs to which subtask

2020-01-10 Thread 杨东晓
Thanks Till, I will do some tests on this. Will this be a public feature in the next release or later?

Till Rohrmann wrote on Fri, Jan 10, 2020 at 6:15 AM:
> Hi,
>
> you would need to set the co-location constraint in order to ensure that
> the sub-tasks of operators are deployed to the same machine

Re: How can I find out which key group belongs to which subtask

2020-01-10 Thread 杨东晓
Thanks Zhijiang, it looks like serialization will always be there in a keyed stream.

Zhijiang wrote on Fri, Jan 10, 2020 at 12:08 AM:
> Only chained operators can avoid record serialization cost, but the
> chaining mode can not support keyed stream.
> If you want to deploy downstream with upstream in the same task m

Re: How can I find out which key group belongs to which subtask

2020-01-10 Thread Till Rohrmann
Hi,

you would need to set the co-location constraint in order to ensure that the sub-tasks of operators are deployed to the same machine. It effectively means that subtasks a_i, b_i of operators a and b will be deployed to the same slot. This feature is not super well exposed but you can take a lo

Re: How can I find out which key group belongs to which subtask

2020-01-10 Thread Zhijiang
Only chained operators can avoid the record serialization cost, but chaining cannot be used across a keyed stream. If you deploy the downstream operator in the same task manager as the upstream one, you avoid the network shuffle cost, which still gives a performance benefit. As far as I know, @Till Rohrmann has impleme

Re: How can I find out which key group belongs to which subtask

2020-01-09 Thread 杨东晓
Thanks Congxian! My goal is not only to send the data to one and the same subtask, but to the specific subtask that runs on the same taskmanager as the upstream record. The key idea is to avoid shuffling between taskmanagers. I think the KeyGroupRangeAssignment.java

Re: How can I find out which key group belongs to which subtask

2020-01-09 Thread Congxian Qiu
Hi,

If you just want to make sure a given key goes to the same subtask, would a custom key selector [1] help? For the key group and subtask information, you can refer to KeyGroupRangeAssignment [2], and for the max parallelism logic you can refer to the doc [3].

[1] https://ci.apache.org/projects/flink/
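
For readers following the pointer to KeyGroupRangeAssignment, here is a minimal standalone sketch of the arithmetic that maps a key group to a subtask index. It is not the Flink class itself: the class name `KeyGroupSketch` and both method names are hypothetical, and the hashing step is left out. In Flink the key group id is derived from a murmur hash of the key's `hashCode()` taken modulo the max parallelism; the sketch takes the key group id as an input and only shows the key-group-to-subtask step, assuming the integer formula `keyGroup * parallelism / maxParallelism`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class, not part of Flink.
public class KeyGroupSketch {

    /** Index of the parallel subtask that owns the given key group (integer division). */
    public static int operatorIndexForKeyGroup(int maxParallelism, int parallelism, int keyGroupId) {
        return keyGroupId * parallelism / maxParallelism;
    }

    /** All key groups owned by one subtask, found by brute force for clarity. */
    public static List<Integer> keyGroupsForSubtask(int maxParallelism, int parallelism, int subtaskIndex) {
        List<Integer> groups = new ArrayList<>();
        for (int kg = 0; kg < maxParallelism; kg++) {
            if (operatorIndexForKeyGroup(maxParallelism, parallelism, kg) == subtaskIndex) {
                groups.add(kg);
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        int maxParallelism = 128; // assumed value for the example
        int parallelism = 4;      // assumed value for the example
        for (int subtask = 0; subtask < parallelism; subtask++) {
            List<Integer> groups = keyGroupsForSubtask(maxParallelism, parallelism, subtask);
            // Each subtask owns one contiguous range of key groups.
            System.out.println("subtask " + subtask + ": key groups "
                    + groups.get(0) + ".." + groups.get(groups.size() - 1));
        }
    }
}
```

With these assumed values, subtask 0 owns key groups 0..31, subtask 1 owns 32..63, and so on. This only answers which subtask index a key group lands on. As discussed in the thread, which taskmanager that subtask runs on is a separate scheduling question (the co-location constraint).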