Hi, if the output is the same, why isn't a single intermediate data set enough?

2017-03-14 14:36 GMT+08:00 Zhijiang(wangzhijiang999) <wangzhijiang...@aliyun.com>:

> Hi,
>
> I think there is no difference between JobVertex(A) and JobVertex(B).
> Because JobVertex(C) is not shown in the right graph, it may mislead
> you.
> There should be another intermediate result partition between JobVertex(B)
> and JobVertex(C) for each parallel instance, and that is the same as the
> case of JobVertex(A).
>
> Cheers,
>
> Zhijiang
>
> ------------------------------------------------------------------
> From: 윤형덕 <ynoo...@naver.com>
> Sent: Monday, March 13, 2017 12:43
> To: user <user@flink.apache.org>
> Subject: multiple consumer of intermediate data set
>
> Hi All,
>
> figure1
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/fig/job_and_execution_graph.svg
>
> As we can see in figure1, JobVertex(B) has two consumers (JobVertex(C) and
> JobVertex(D)), and accordingly the Intermediate Data Set of JobVertex(B)
> has two consumers (JobVertex(C) and JobVertex(D)).
>
> But in the case of JobVertex(A), though it has two consumers (JobVertex(B)
> and JobVertex(D)), the same as JobVertex(B), it has two separate
> intermediate data sets, and each intermediate data set has one consumer.
>
> I couldn't understand why. To me it looks like the same case, so why does
> one have one Intermediate Data Set and the other have two?
> Could anyone explain the difference between JobVertex(A) and
> JobVertex(B)?