Hi faaron,

For sink parallelism:
- What is the parallelism of the sink's input? The sink should run with the same parallelism as its input.
- Does your SQL have order by or limit? Flink batch SQL does not support range partitioning yet, so order by runs with a parallelism of 1.
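
If the sink's input turns out to be running with parallelism 1 only because no default was configured, raising the default parallelism may help. This is a minimal sketch, assuming Flink 1.10 with the Blink planner; the value 64 is an arbitrary placeholder, not a recommendation. The first key goes in flink-conf.yaml, and the second is a planner option set on the TableConfig's Configuration rather than in that file:

```
# flink-conf.yaml: default parallelism for operators that do not set their own.
# 64 is a placeholder; pick a value that fits your cluster's slots.
parallelism.default: 64

# Blink planner option, set through tEnv.getConfig().getConfiguration().
# The default of -1 falls back to the environment's parallelism.
table.exec.resource.default-parallelism: 64
```

Neither setting changes the order by case above, which still runs with a single parallel instance.
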
For TaskManager memory, there is a managed memory option you can configure [1].

[1] https://ci.apache.org/projects/flink/flink-docs-master/ops/memory/mem_setup.html#managed-memory

Best,
Jingsong Lee

On Fri, Mar 6, 2020 at 5:38 PM faaron zheng <faaronzh...@gmail.com> wrote:

> Hi all,
>
> I am trying to use Flink SQL to run a Hive task. I use tEnv.sqlUpdate to
> execute my SQL, which looks like "insert overwrite ... select ...". But I
> find the parallelism of the sink is always 1, which is intolerable for large
> data. Why does this happen? Also, is there any guide for deciding the memory
> of the taskmanager when I have two huge tables to hash join, for example,
> when each table has several TB of data?
>
> Thanks,
> Faaron