I think the new configuration could be: "kubernetes.taskmanager.memory.amount" and "kubernetes.jobmanager.memory.amount";
the limit factor can then be calculated from the configured requests and limits. In native mode we would
no longer fall back to process.size as the default container memory, but use this configuration instead,
decoupling the container resources from the Flink configuration.
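For example, with requests.memory "1Gi" and limits.memory "2Gi" on the jobmanager, the operator could
derive something like the following (only a sketch: the *.memory.amount key is the proposal above and
does not exist yet, and I am assuming the existing limit-factor semantics of limit = request * factor):

    kubernetes.jobmanager.memory.amount: 1g          # proposed key, taken from requests.memory
    kubernetes.jobmanager.memory.limit-factor: 2.0   # existing option: 2Gi limit / 1Gi request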
Thanks
Richard Su

> On Dec 5, 2023, at 19:22, richard.su <richardsuc...@gmail.com> wrote:
>
> Hi Gyula, in my opinion this still uses the FlinkDeployment's resource
> field to set jobmanager.memory.process.size, and I have already described an
> uncovered case:
>
> When a user defines a FlinkDeployment whose jobmanager has 1G of memory
> in the container field but configures jobmanager.memory.process.size as
> 850m, this solution only improves the user config and makes it more
> intuitive and easier, but it does not decouple the container resources
> from the Flink configuration.
>
> So from my side, I think we need to add new configuration to support this
> proposal, and it needs more discussion.
>
> Thanks
> Chaoran Su
>
>
>> On Dec 5, 2023, at 18:28, Gyula Fóra <gyula.f...@gmail.com> wrote:
>>
>> This is the proposal according to FLINK-33548:
>>
>> spec:
>>   taskManager:
>>     resources:
>>       requests:
>>         memory: "64Mi"
>>         cpu: "250m"
>>       limits:
>>         memory: "128Mi"
>>         cpu: "500m"
>>
>> I honestly think this is much more intuitive and easier than using the
>> podTemplate, which is very complex immediately.
>> Please tell me what use case/setup is not covered by this improved spec.
>>
>> Unless there is a big limitation here, I am still -1 on modifying the
>> podTemplate logic and +1 on continuing with FLINK-33548.
>>
>> Gyula
>>
>>
>>
>> On Tue, Dec 5, 2023 at 11:16 AM Surendra Singh Lilhore <
>> surendralilh...@gmail.com> wrote:
>>
>>> Hi Gyula,
>>>
>>> FLINK-33548 proposes adding a new resource field to match the Kubernetes
>>> pod resource configuration. Here's my suggestion: instead of adding a new
>>> resource field, let's use a pod template for more advanced resource setup.
>>> Adding a new resource field might confuse users. This change can also help
>>> with issues when users use Flink Kubernetes commands directly, without the
>>> operator.
>>>
>>> Thanks
>>> Surendra
>>>
>>>
>>> On Tue, Dec 5, 2023 at 3:10 PM richard.su <richardsuc...@gmail.com> wrote:
>>>
>>>> Sorry Gyula, let me explain more about point 2: if I avoid the
>>>> override, I still get a jobmanager pod whose resources are consistent
>>>> with "jobmanager.memory.process.size", but a FlinkDeployment with
>>>> resources larger than that.
>>>>
>>>> Thanks for your time.
>>>> Richard Su
>>>>
>>>>> On Dec 5, 2023, at 17:13, richard.su <richardsuc...@gmail.com> wrote:
>>>>>
>>>>> Thank you for your time, Gyula. I have more questions about
>>>>> FLINK-33548; we can discuss this further and make progress:
>>>>>
>>>>> 1. I agree with you about declaring resources in the FlinkDeployment
>>>>> resource sections. But the Flink Operator will override
>>>>> "jobmanager.memory.process.size" and "taskmanager.memory.process.size"
>>>>> whether or not I have set these options in the Flink configuration. If
>>>>> the user has configured all memory attributes, the override leads to an
>>>>> error because the overall memory computation becomes inconsistent.
>>>>>
>>>>> The override code is in FlinkConfigManager.class, in the buildFrom
>>>>> method, which applies to JobManagerSpec and TaskManagerSpec.
>>>>>
>>>>> 2. Even if I modify the override code, I will still run into the issue
>>>>> from FLINK-24150, because I would only have modified the flink operator
>>>>> code, not the flink-kubernetes package, so I end up with pod resources
>>>>> like (cpu: 1, memory: 1g) but container resources of (cpu: 1, memory:
>>>>> 850m), because I already set jobmanager.memory.process.size to 850m.
>>>>>
>>>>> 3. Because of these two points, we need to give the podTemplate higher
>>>>> priority. Otherwise we can refactor the flink operator code and
>>>>> introduce some new configuration to support native mode.
>>>>>
>>>>> I think it would be better to introduce some configuration that
>>>>> FlinkConfigManager.class can override using the resources of
>>>>> JobManagerSpec and TaskManagerSpec.
>>>>>
>>>>> Deeper in the flink-kubernetes package, we would then use this new
>>>>> configuration as the final container resources.
>>>>>
>>>>> Thanks for your time.
>>>>> Richard Su
>>>>>
>>>>>> On Dec 5, 2023, at 16:45, Gyula Fóra <gyula.f...@gmail.com> wrote:
>>>>>>
>>>>>> As you can see in the Jira ticket there hasn't been any progress;
>>>>>> nobody has started to work on this yet.
>>>>>>
>>>>>> I personally don't think it's confusing to declare resources in the
>>>>>> FlinkDeployment resource sections. It's well documented and has worked
>>>>>> very well so far for most users.
>>>>>> This is pretty common practice for Kubernetes.
>>>>>>
>>>>>> Cheers,
>>>>>> Gyula
>>>>>>
>>>>>> On Tue, Dec 5, 2023 at 9:35 AM richard.su <richardsuc...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Gyula, has there been any progress on FLINK-33548? I would like to
>>>>>>> join the discussion but I haven't seen any discussion at the URL.
>>>>>>>
>>>>>>> I also create FlinkDeployments with the flink operator, which indeed
>>>>>>> overrides the process size with TaskManagerSpec.resources or
>>>>>>> JobManagerSpec.resources, which is really confusing; I have modified
>>>>>>> the flink operator code to avoid the override.
>>>>>>>
>>>>>>> Looking forward to your response.
>>>>>>>
>>>>>>> Thank you.
>>>>>>> Richard Su
>>>>>>>
>>>>>>>
>>>>>>>> On Dec 5, 2023, at 16:22, Gyula Fóra <gyula.f...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hi!
>>>>>>>>
>>>>>>>> Please see the discussion in
>>>>>>>> https://lists.apache.org/thread/6p5tk6obmk1qxf169so498z4vk8cg969
>>>>>>>> and the ticket: https://issues.apache.org/jira/browse/FLINK-33548
>>>>>>>>
>>>>>>>> We should follow the approach outlined there. If you are interested,
>>>>>>>> you are welcome to pick up the operator ticket.
>>>>>>>>
>>>>>>>> Unfortunately your PR could be a large unexpected change for existing
>>>>>>>> users, so we should not add it.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Gyula
>>>>>>>>
>>>>>>>> On Tue, Dec 5, 2023 at 9:05 AM 苏超腾 <richardsuc...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hello everyone,
>>>>>>>>>
>>>>>>>>> I've encountered an issue while using Flink Kubernetes native.
>>>>>>>>> Despite setting resource limits in the pod template, it appears that
>>>>>>>>> these limits and requests are not considered during JobManager (JM)
>>>>>>>>> and TaskManager (TM) pod deployment.
>>>>>>>>>
>>>>>>>>> I found that an issue had already been opened in Jira, FLINK-24150,
>>>>>>>>> which raises almost the same questions that I encountered.
>>>>>>>>>
>>>>>>>>> I agree that if the user has provided pod templates, we should give
>>>>>>>>> them higher priority than what Flink calculates from the
>>>>>>>>> configuration.
>>>>>>>>>
>>>>>>>>> But this needs some discussion in our community, because it relates
>>>>>>>>> to some scenarios:
>>>>>>>>> If I want to create a pod with Guaranteed QoS and want the memory of
>>>>>>>>> the Flink main container to be larger than the Flink process size, I
>>>>>>>>> cannot directly modify the podTemplate (although we can use the
>>>>>>>>> limit factor, this will change the QoS from Guaranteed to Burstable).
>>>>>>>>> If I want to create a pod with Burstable QoS, I don't want to use the
>>>>>>>>> limit factor and want to directly configure the request to be 50% of
>>>>>>>>> the limit, which cannot be done today.
>>>>>>>>> To cover these scenarios, I have opened a pull request:
>>>>>>>>> https://github.com/apache/flink/pull/23872
>>>>>>>>>
>>>>>>>>> The code is very simple and just needs someone to review it; the PR
>>>>>>>>> can be cherry-picked to older versions, which will be helpful.
>>>>>>>>>
>>>>>>>>> I would appreciate any feedback on this.
>>>>>>>>>
>>>>>>>>> Thank you for your time and contributions to the Flink project.
>>>>>>>>>
>>>>>>>>> Thank you,
>>>>>>>>> chaoran.su