Hi hjw To rescale data for dim join, I think you can use `partition by` in sql before `dim join` which will redistribute data by specific column. In addition, you can add cache for `dim table` to improve performance too.
Best, Shammon FY On Tue, Apr 4, 2023 at 10:28 AM Hang Ruan <ruanhang1...@gmail.com> wrote: > Hi, hiw, > > IMO, I think the parallelism 1 is enough for you job if we do not consider > the sink. I do not know why you need set the lookup join operator's > parallelism to 6. > The SQL planner will help us to decide the type of the edge and we can not > change it. > Maybe you could share the Execution graph to provide more information. > > Best, > Hang > > hjw <hjw_em...@163.com> 于2023年4月4日周二 00:37写道: > >> For example. I create a kafka source to subscribe the topic that have >> one partition and set the default parallelism of the job to 6.The next >> operator of kafka source is that lookup join a mysql table.However, the >> relationship between the kafka Source and the Lookup join operator is >> Forward, so only one subtask in the Lookup join operator can receive data.I >> want to set the relationship between the kafka Source and the Lookup join >> operator is reblance so that all subtask in Lookup join operator can >> recevie data. >> >> Env: >> Flink version:1.15.1 >> >> >> -- >> Best, >> Hjw >> >