Great. If no one else wants to work on ticket FLINK-15648, I will try to get it done in the next major release cycle (1.15).
Best,
Yang

Denis Cosmin NUTIU <dnu...@bitdefender.com> wrote on Tuesday, August 31, 2021 at 4:59 PM:

> Hi everyone,
>
> Thanks for getting back to me!
>
> > I think it would be nice if the task manager pods get their values from
> > the configuration file only if the pod templates don’t specify any
> > resources. That was the goal of supporting pod templates, right? Allowing
> > more custom scenarios without letting the configuration options get bloated.
>
> I think that's correct. In the current behavior Flink will override the
> resource settings: "The memory and cpu resources(including requests and
> limits) will be overwritten by Flink configuration options. All other
> resources(e.g. ephemeral-storage) will be retained."[1] After reading the
> comments on FLINK-15648[2], I'm not sure that it can be done in a clean
> manner with pod templates.
>
> > I think it is a good improvement to support different resource requests
> > and limits. And it is very useful especially for the CPU resource since
> > it heavily depends on the upstream workloads.
>
> I agree with you! I have limited knowledge of Flink internals, but the
> kubernetes.jobmanager.limit-factor and kubernetes.taskmanager.limit-factor
> options seem to be the right way to do it.
>
> [1] Native Kubernetes | Apache Flink
> <https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#pod-template>
> [2] [FLINK-15648] Support to configure limit for CPU and memory
> requirement - ASF JIRA (apache.org)
> <https://issues.apache.org/jira/browse/FLINK-15648>
>
> ------------------------------
> *From:* Yang Wang <danrtsey...@gmail.com>
> *Sent:* Tuesday, August 31, 2021 6:04 AM
> *To:* Alexis Sarda-Espinosa <alexis.sarda-espin...@microfocus.com>
> *Cc:* Denis Cosmin NUTIU <dnu...@bitdefender.com>; matth...@ververica.com
> <matth...@ververica.com>; user@flink.apache.org <user@flink.apache.org>
> *Subject:* Re: Deploying Flink on Kubernetes with fractional CPU and
> different limits and requests
>
> Hi all,
>
> I think it is a good improvement to support different resource requests
> and limits. And it is very useful especially for the CPU resource since
> it heavily depends on the upstream workloads.
>
> Actually, we (Alibaba) have introduced some internal config options to
> support this feature. WDYT?
>
> // The prefix of Kubernetes resource limit factor. It should not be less than 1.
> // The resource could be cpu, memory, ephemeral-storage and all other types
> // supported by Kubernetes.
> public static final String KUBERNETES_JOBMANAGER_RESOURCE_LIMIT_FACTOR_PREFIX =
>     "kubernetes.jobmanager.limit-factor.";
> public static final String KUBERNETES_TASKMANAGER_RESOURCE_LIMIT_FACTOR_PREFIX =
>     "kubernetes.taskmanager.limit-factor.";
>
> BTW, we already have an old ticket for this feature[1].
>
> [1]. https://issues.apache.org/jira/browse/FLINK-15648
>
> Best,
> Yang
>
> Alexis Sarda-Espinosa <alexis.sarda-espin...@microfocus.com>
> wrote on Thursday, August 26, 2021 at 10:04 PM:
>
> I think it would be nice if the task manager pods get their values from
> the configuration file only if the pod templates don’t specify any
> resources. That was the goal of supporting pod templates, right?
> Allowing more custom scenarios without letting the configuration options
> get bloated.
>
> Regards,
> Alexis.
>
> *From:* Denis Cosmin NUTIU <dnu...@bitdefender.com>
> *Sent:* Thursday, 26 August 2021 15:55
> *To:* matth...@ververica.com
> *Cc:* user@flink.apache.org; danrtsey...@gmail.com
> *Subject:* Re: Deploying Flink on Kubernetes with fractional CPU and
> different limits and requests
>
> Hi Matthias,
>
> Thanks for getting back to me and for your time!
>
> We have some Flink jobs deployed on Kubernetes and running kubectl top pod
> gives the following result:
>
> NAME                  CPU(cores)   MEMORY(bytes)
> aa-78c8cb77d4-zlmpg   8m           1410Mi
> aa-taskmanager-2-2    32m          1066Mi
> bb-5f7b65f95c-jwb7t   7m           1445Mi
> bb-taskmanager-2-2    32m          1016Mi
> cc-54d967b55d-b567x   11m          514Mi
> cc-taskmanager-4-1    11m          496Mi
> dd-6fbc6b8666-krhlx   10m          535Mi
> dd-taskmanager-2-2    12m          522Mi
> xx-6845cf7986-p45lq   53m          526Mi
> xx-taskmanager-5-2    11m          507Mi
>
> During low workloads the jobs consume just about 100m CPU and during high
> workloads the CPU consumption increases to 500m-1000m. Having the ability
> to specify requests and limits separately would give us more deployment
> flexibility.
>
> Sincerely,
> Denis
>
> On Thu, 2021-08-26 at 14:22 +0200, Matthias Pohl wrote:
>
> Hi Denis,
>
> I did a bit of digging: It looks like there is no way to specify them
> independently. You can find documentation about pod templates for
> TaskManager and JobManager [1]. But even there it states that for cpu and
> memory, the resource specs are overwritten by the Flink configuration. The
> code also reveals that limit and requests are set using the same value [2].
>
> I'm going to pull Yang Wang into this thread. I'm wondering whether there
> is a reason for that or whether it makes sense to create a Jira issue
> introducing more specific configuration parameters for limit and requests.
>
> Best,
> Matthias
>
> [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#fields-overwritten-by-flink
> [2] https://github.com/apache/flink/blob/f64261c91b195ecdcd99975b51de540db89a3f48/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/utils/KubernetesUtils.java#L324-L332
>
> On Thu, Aug 26, 2021 at 11:17 AM Denis Cosmin NUTIU
> <dnu...@bitdefender.com> wrote:
>
> Hello,
>
> I've developed a Flink job and I'm trying to deploy it on a Kubernetes
> cluster using Flink Native.
>
> Setting kubernetes.taskmanager.cpu=0.5 and
> kubernetes.jobmanager.cpu=0.5 sets the requests and limits to 500m,
> which is correct, but I'd like to set the requests and limits to
> different values, something like:
>
> resources:
>   requests:
>     memory: "1048Mi"
>     cpu: "100m"
>   limits:
>     memory: "2096Mi"
>     cpu: "1000m"
>
> I've tried using pod templates from Flink 1.13 and manually patching
> the Kubernetes deployment file. The jobmanager gets spawned with the
> correct resource requests and limits, but the taskmanagers get spawned
> with the defaults:
>
> Limits:
>   cpu:     1
>   memory:  1728Mi
> Requests:
>   cpu:     1
>   memory:  1728Mi
>
> Is there any way I could set the requests/limits for the CPU/Memory to
> different values when deploying Flink in Kubernetes? If not, would it
> make sense to request this as a feature?
>
> Thanks in advance!
>
> Denis
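
For readers following the limit-factor idea discussed in this thread, here is a minimal sketch of the arithmetic it implies: the limit is the request multiplied by a factor that must not be less than 1. The class LimitFactorSketch and the method limitFor are hypothetical names made up for illustration; this is not Flink code, and the thread does not say how the option would actually be implemented. The request values are the ones from the original question (100m CPU, 1048Mi memory), and the factors 10 and 2 are assumptions chosen so that the results match the limits asked for there.

```java
/** Hypothetical illustration of a "limit factor"; not part of Flink. */
public class LimitFactorSketch {

    /**
     * Derives a resource limit from a request and a factor.
     * The request is given in its base unit (millicores for cpu, Mi for memory).
     * Factors below 1 are rejected, following the option description in the thread.
     */
    public static long limitFor(long request, double factor) {
        if (factor < 1.0) {
            throw new IllegalArgumentException(
                    "limit factor must not be less than 1, got " + factor);
        }
        return Math.round(request * factor);
    }

    public static void main(String[] args) {
        long cpuRequestMillis = 100; // cpu request "100m" from the original question
        long memoryRequestMi = 1048; // memory request "1048Mi" from the original question

        // Assumed factors: 10 for cpu, 2 for memory.
        System.out.println("cpu limit: " + limitFor(cpuRequestMillis, 10.0) + "m");
        System.out.println("memory limit: " + limitFor(memoryRequestMi, 2.0) + "Mi");
    }
}
```

With a factor of 1 the limit equals the request, which corresponds to the behavior Matthias describes, where limit and requests are set from the same value.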