Hi Yun,

Thanks a lot for your reply!

Regarding the parallelism, I think `table.exec.resource.default-parallelism`,
which you mentioned, is a good alternative, but it requires a set operation
before running each query. And since it’s a `default` value, I suppose there
should be an option with higher priority that can override it? I’m looking
for that, if possible.
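
To make the "set operation before each query" concrete, here is a minimal sketch of the pattern in plain Java. It is only an illustration under stated assumptions: `ScopedOption` and `withOption` are hypothetical names, not Flink APIs, and the `Map` stands in for the configuration of the shared StreamTableEnvironment (which, as far as I know, is reachable via `tableEnv.getConfig().getConfiguration()` in 1.13). It also assumes the option is read while the query is being translated and submitted, and that jobs are submitted from a single thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper, not part of Flink: scopes one option to one submission.
final class ScopedOption {
    static final String PARALLELISM_KEY = "table.exec.resource.default-parallelism";

    // Sets key=value, runs the action, then restores the key's previous state,
    // so the next job submitted through the same environment is unaffected.
    static <T> T withOption(Map<String, String> conf, String key, String value,
                            Supplier<T> action) {
        boolean hadKey = conf.containsKey(key);
        String previous = conf.get(key);
        conf.put(key, value);
        try {
            return action.get();
        } finally {
            if (hadKey) {
                conf.put(key, previous);
            } else {
                conf.remove(key);
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for the shared environment's configuration.
        Map<String, String> conf = new HashMap<>();

        // Stand-ins for job submissions: each one just reports the value it saw.
        String seenByJobA = withOption(conf, PARALLELISM_KEY, "4",
                () -> conf.get(PARALLELISM_KEY));
        String seenByJobB = withOption(conf, PARALLELISM_KEY, "16",
                () -> conf.get(PARALLELISM_KEY));

        System.out.println("job A saw " + seenByJobA + ", job B saw " + seenByJobB);
        System.out.println("left behind: " + conf.containsKey(PARALLELISM_KEY));
    }
}
```

This works, but it is exactly the per-query bookkeeping I would like to avoid, and it would not be safe if jobs were submitted concurrently from the same environment.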

As for the memory settings, I think that in session cluster mode the
JobManager requests resources according to the job’s resource requirements,
so the TaskManager’s memory configuration can be specified at the job level.

In FLIP-53 [1], there are a few words about this:

> Current status (including plans for Flink 1.10) of how to set operators' 
> resource requirements for jobs can be described as follows: 
> - SQL/Table API - Blink optimizer can set operator resources for the users, 
> according to their configurations (default: unknown) 
> - DataStream API -  There are no method / interface to set operator resources 
> at the moment. It can be added in the future. 
> - DataSet API - There are existing user interfaces to set operator resources.

I wonder what the current status is. Is there any configuration or API that
can be used to adjust job resources in the SQL/Table API?
In my case, approaches using the DataStream API are also viable, because I’m
submitting the jobs programmatically.

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-53%3A+Fine+Grained+Operator+Resource+Management

Best,
Paul Lam

> On Jul 19, 2021, at 21:46, Yun Gao <yungao...@aliyun.com> wrote:
> 
> Hi Paul,
> 
> For parallelism, it should be possible to set it with 
> `table.exec.resource.default-parallelism` [1],
> and an example of setting the parameter is in the first few paragraphs of that page. 
> 
> But regarding the total process memory, I think it can only be set at the 
> cluster level, since it is a per-cluster option: the TM has to use the option 
> on startup, before the job is submitted.
> 
> Best,
> Yun
> 
> 
> 
> [1] 
> https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/config/#table-exec-resource-default-parallelism
> 
> 
> 
> ------------------------------------------------------------------
> From:Paul Lam <paullin3...@gmail.com>
> Send Time:2021 Jul. 16 (Fri.) 12:21
> To:user <user@flink.apache.org>
> Subject:Set job specific resources in one StreamTableEnvironment
> 
> Hi,
> I’m reusing the same StreamTableEnvironment to submit multiple Table/SQL jobs 
> to a session cluster, 
> but I couldn’t find a proper way to specify job resources for each job (like 
> parallelism and total process 
> memory), so they all use the cluster defaults. 
> 
> I have considered overriding the resource specs of all nodes in the StreamGraph, 
> but it’s problematic 
> because some nodes have a parallelism limit (e.g. can’t be greater than 1).
> I think I might be missing something and there should be a better way to do 
> this. Please give me some 
> pointers. Thanks a lot!
> Best,
> Paul Lam
> 
> 
