Re: Discussion: [FLINK-24150] Support to configure cpu resource request and limit in pod template

richard.su Tue, 05 Dec 2023 04:49:15 -0800

Sorry, a correction. The sentence "To be clear, we need a container has memory 
larger than request, and confirm this pod has Guarantee Qos." should read: "To 
be clear, we need a container with more memory than process.size, and we need 
to ensure that this pod has Guaranteed QoS."
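
For reference, the Guaranteed QoS requirement mentioned here can be sketched in Python. This is only an illustration of the Kubernetes QoS rules for a pod with a single container, not Flink or Kubernetes code. The function name qos_class is hypothetical, and quantities are compared as plain strings, so "1Gi" and "1024Mi" would be treated as different values.

```python
from typing import Mapping, Optional

# Only CPU and memory decide the QoS class.
RESOURCES = ("cpu", "memory")


def qos_class(requests: Optional[Mapping[str, str]],
              limits: Optional[Mapping[str, str]]) -> str:
    """Classify a single-container pod as Guaranteed, Burstable or BestEffort.

    Hypothetical helper for illustration; compares quantities as strings.
    """
    requests = dict(requests or {})
    limits = dict(limits or {})

    # No CPU or memory request or limit at all: BestEffort.
    if not any(name in requests or name in limits for name in RESOURCES):
        return "BestEffort"

    # Kubernetes defaults a missing request to the limit when a limit is set.
    for name in RESOURCES:
        if name in limits:
            requests.setdefault(name, limits[name])

    # Guaranteed needs a limit for both resources, each equal to its request.
    guaranteed = all(
        name in limits and requests.get(name) == limits[name]
        for name in RESOURCES
    )
    return "Guaranteed" if guaranteed else "Burstable"
```

With this sketch, a container whose memory request and limit are both "1Gi" (and whose CPU request equals its CPU limit) is Guaranteed, while any pod whose limit was raised above the request through a limit factor is Burstable.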


Thanks.

Richard Su


> On Dec 5, 2023, at 20:47, richard.su <richardsuc...@gmail.com> wrote:
> 
> Hi Gyula, yes, this is a special case in our scenario, and I am sorry that 
> it is hard to understand. We want to reserve some memory beyond the 
> JobManager or TaskManager process. To be clear, we need a container that has 
> more memory than request, and we need to ensure that this pod has Guaranteed QoS.
> 
> This is because we hit the glibc problem inside the container with Flink 
> jobs using RocksDB, and the reserved memory helps to ease this problem.
> 
> So I hope the container's resource request can be decoupled from the Flink 
> configuration.
> 
> With Flink's current implementation, this cannot be done.
> 
> Thanks.
> 
> Richard Su
> 
>> On Dec 5, 2023, at 20:28, Gyula Fóra <gyula.f...@gmail.com> wrote:
>> 
>> Richard, I still don't understand why the current setup doesn't work for
>> you. According to
>> https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/deployment/memory/mem_setup/
>> :
>> 
>> The process memory config (which is what we configure) translates directly
>> into the container request size. With the new proposal you can set the
>> limit independently.
>> 
>> What you write doesn't make sense to me:
>> "user wants to define a flinkdeployment with jobmanager has 1G memory
>> resources in container field but config jobmanager.memory.process.size as
>> 850m"
>> 
>> If you want to have a 1G container, you simply set the memory request
>> (process.size) in the spec to 1G. Then you have 1G; there are other
>> configs on how this 1G will be split inside the container for various
>> purposes but these are all covered in detail by the flink memory configs.
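
A rough Python sketch of the mapping described above. It assumes that the container memory request equals process.size and that the limit is the request multiplied by a limit factor (such as kubernetes.jobmanager.memory.limit-factor), with a factor of at least 1.0. The helper name container_memory_mib is hypothetical, and this is not the actual flink-kubernetes code.

```python
from typing import Tuple


def container_memory_mib(process_size_mib: int,
                         limit_factor: float = 1.0) -> Tuple[int, int]:
    """Return (request, limit) in MiB for a container.

    Assumption: request == process.size and limit == request * limit_factor.
    """
    if limit_factor < 1.0:
        raise ValueError("limit factor is assumed to be >= 1.0")
    request = process_size_mib
    limit = int(process_size_mib * limit_factor)
    return request, limit
```

Under these assumptions, a process.size of 850 MiB always yields a request of 850 MiB. A factor above 1.0 raises only the limit, so the request and limit no longer match, which is the situation the thread is discussing.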
>> 
>> Cheers
>> Gyula
>> 
>> On Tue, Dec 5, 2023 at 1:06 PM richard.su <richardsuc...@gmail.com> wrote:
>> 
>>> I think the new configuration could be:
>>> 
>>> "kubernetes.taskmanager.memory.amount" and
>>> "kubernetes.jobmanager.memory.amount"
>>> 
>>> since we can calculate the limit-factor from the difference between
>>> requests and limits.
>>> 
>>> In native mode, we would no longer use process.size as the default memory,
>>> but would use this configuration for the decoupling logic.
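
To illustrate how a limit factor could be derived from a request and a limit, here is a small Python sketch. It assumes the factor is the ratio limit / request, because the limit factor acts as a multiplier. The helpers parse_quantity and limit_factor are hypothetical and handle only a small subset of the Kubernetes quantity suffixes.

```python
from decimal import Decimal

# Subset of Kubernetes quantity suffixes; anything else is not handled here.
_SUFFIXES = {
    "Ki": Decimal(2) ** 10,
    "Mi": Decimal(2) ** 20,
    "Gi": Decimal(2) ** 30,
    "Ti": Decimal(2) ** 40,
    "m": Decimal("0.001"),
    "k": Decimal(10) ** 3,
    "M": Decimal(10) ** 6,
    "G": Decimal(10) ** 9,
}


def parse_quantity(text: str) -> Decimal:
    """Parse a quantity such as "64Mi", "250m" or "2" into base units."""
    # Try two-letter suffixes first so "Mi" is not mistaken for "M".
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(text)


def limit_factor(request: str, limit: str) -> float:
    """Return limit / request for two quantities of the same resource."""
    req = parse_quantity(request)
    lim = parse_quantity(limit)
    if req <= 0:
        raise ValueError("request must be positive")
    return float(lim / req)
```

For example, limit_factor("64Mi", "128Mi") and limit_factor("250m", "500m") both give a factor of 2.0, and equal request and limit give 1.0.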
>>> 
>>> Thanks
>>> 
>>> Richard Su
>>> 
>>>> On Dec 5, 2023, at 19:22, richard.su <richardsuc...@gmail.com> wrote:
>>>> 
>>>> Hi Gyula, in my opinion, this will still use the FlinkDeployment's
>>>> resource field to set jobmanager.memory.process.size, and I have
>>>> described a case that is not covered:
>>>> 
>>>> A user wants to define a FlinkDeployment whose JobManager has 1G of
>>>> memory in the container resource field, but configures
>>>> jobmanager.memory.process.size as 850m. This solution only improves the
>>>> user configuration and makes it more intuitive and easier, but it does
>>>> not decouple the container resources from the Flink configuration.
>>>> 
>>>> So from my side, I think we need to add a new configuration to support
>>>> this proposal, and it needs more discussion.
>>>> 
>>>> Thanks
>>>> Chaoran Su
>>>> 
>>>> 
>>>>> On Dec 5, 2023, at 18:28, Gyula Fóra <gyula.f...@gmail.com> wrote:
>>>>> 
>>>>> This is the proposal according to FLINK-33548:
>>>>> 
>>>>> spec:
>>>>>   taskManager:
>>>>>     resources:
>>>>>       requests:
>>>>>         memory: "64Mi"
>>>>>         cpu: "250m"
>>>>>       limits:
>>>>>         memory: "128Mi"
>>>>>         cpu: "500m"
>>>>> 
>>>>> I honestly think this is much more intuitive and easier than using the
>>>>> podTemplate, which becomes very complex very quickly.
>>>>> Please tell me what use-case/setup is not covered by this improved spec.
>>>>> 
>>>>> Unless there is a big limitation here I am still -1 for modifying the
>>>>> podTemplate logic and +1 for continuing with FLINK-33548
>>>>> 
>>>>> Gyula
>>>>> 
>>>>> 
>>>>> 
>>>>> On Tue, Dec 5, 2023 at 11:16 AM Surendra Singh Lilhore <
>>>>> surendralilh...@gmail.com> wrote:
>>>>> 
>>>>>> Hi Gyula,
>>>>>> 
>>>>>> FLINK-33548 proposes adding a new resource field to match the
>>>>>> Kubernetes pod resource configuration. Here's my suggestion: instead of
>>>>>> adding a new resource field, let's use a pod template for more advanced
>>>>>> resource setup. Adding a new resource field might confuse users. This
>>>>>> change can also help with issues when users use Flink Kubernetes
>>>>>> commands directly, without the operator.
>>>>>> 
>>>>>> Thanks
>>>>>> Surendra
>>>>>> 
>>>>>> 
>>>>>> On Tue, Dec 5, 2023 at 3:10 PM richard.su <richardsuc...@gmail.com>
>>> wrote:
>>>>>> 
>>>>>>> Sorry Gyula, let me explain point 2 further. If I avoid the override,
>>>>>>> I still get a JobManager pod whose resources are consistent with
>>>>>>> "jobmanager.memory.process.size", while the FlinkDeployment declares a
>>>>>>> resource larger than that.
>>>>>>> 
>>>>>>> Thanks for your time.
>>>>>>> Richard Su
>>>>>>> 
>>>>>>>> On Dec 5, 2023, at 17:13, richard.su <richardsuc...@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> Thank you for your time, Gyula. I have more questions about
>>>>>>>> FLINK-33548; we can discuss this further and make progress:
>>>>>>>> 
>>>>>>>> 1. I agree with you about declaring resources in the FlinkDeployment
>>>>>>>> resource sections. But the Flink operator will override
>>>>>>>> "jobmanager.memory.process.size" and "taskmanager.memory.process.size",
>>>>>>>> whether or not I have set them in the Flink configuration. If the user
>>>>>>>> has configured all the memory attributes, the override leads to an
>>>>>>>> error, because the overall memory computation no longer adds up.
>>>>>>>> 
>>>>>>>> The override code is in the buildFrom method of
>>>>>>>> FlinkConfigManager.class, and it applies to JobManagerSpec and
>>>>>>>> TaskManagerSpec.
>>>>>>>> 
>>>>>>>> 2. If I modify the override code, I will still encounter the issue in
>>>>>>>> FLINK-24150, because I have only modified the Flink operator and not
>>>>>>>> the flink-kubernetes package. I would get pod resources like (cpu: 1c,
>>>>>>>> memory: 1g) and container resources of (cpu: 1c, memory: 850m),
>>>>>>>> because I have already set jobmanager.memory.process.size to 850m.
>>>>>>>> 
>>>>>>>> 3. Because of these two points, we need to give the podTemplate higher
>>>>>>>> priority. Otherwise, we can refactor the Flink operator code, which
>>>>>>>> should introduce some new configuration to support native mode.
>>>>>>>> 
>>>>>>>> I think it would be better to introduce some configuration that
>>>>>>>> FlinkConfigManager.class can override using the resources of
>>>>>>>> JobManagerSpec and TaskManagerSpec.
>>>>>>>> 
>>>>>>>> Deeper in the code of the flink-kubernetes package, we would use this
>>>>>>>> new configuration as the final value of the container resources.
>>>>>>>> 
>>>>>>>> Thanks for your time.
>>>>>>>> Richard Su
>>>>>>>> 
>>>>>>>>> On Dec 5, 2023, at 16:45, Gyula Fóra <gyula.f...@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>> As you can see in the jira ticket there hasn't been any progress,
>>>>>> nobody
>>>>>>>>> started to work on this yet.
>>>>>>>>> 
>>>>>>>>> I personally don't think it's confusing to declare resources in the
>>>>>>>>> FlinkDeployment resource sections. It's well documented and worked
>>>>>> very
>>>>>>>>> well so far for most users.
>>>>>>>>> This is pretty common practice for kubernetes.
>>>>>>>>> 
>>>>>>>>> Cheers,
>>>>>>>>> Gyula
>>>>>>>>> 
>>>>>>>>> On Tue, Dec 5, 2023 at 9:35 AM richard.su <richardsuc...@gmail.com>
>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Hi Gyula, has there been any progress on FLINK-33548? I would like
>>>>>>>>>> to join the discussion, but I haven't seen any discussion at the URL.
>>>>>>>>>> 
>>>>>>>>>> I also create FlinkDeployments with the Flink operator, which indeed
>>>>>>>>>> overrides the process size with TaskManagerSpec.resources or
>>>>>>>>>> JobManagerSpec.resources. This is really confusing, so I modified
>>>>>>>>>> the Flink operator code to avoid the override.
>>>>>>>>>> 
>>>>>>>>>> Looking forward to your response.
>>>>>>>>>> 
>>>>>>>>>> Thank you.
>>>>>>>>>> Richard Su
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> On Dec 5, 2023, at 16:22, Gyula Fóra <gyula.f...@gmail.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Hi!
>>>>>>>>>>> 
>>>>>>>>>>> Please see the discussion in
>>>>>>>>>>> https://lists.apache.org/thread/6p5tk6obmk1qxf169so498z4vk8cg969
>>>>>>>>>>> and the ticket: https://issues.apache.org/jira/browse/FLINK-33548
>>>>>>>>>>> 
>>>>>>>>>>> We should follow the approach outlined there. If you are
>>> interested
>>>>>>> you
>>>>>>>>>> are
>>>>>>>>>>> welcome to pick up the operator ticket.
>>>>>>>>>>> 
>>>>>>>>>>> Unfortunately your PR can be a large unexpected change to existing
>>>>>>> users
>>>>>>>>>> so
>>>>>>>>>>> we should not add it.
>>>>>>>>>>> 
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Gyula
>>>>>>>>>>> 
>>>>>>>>>>> On Tue, Dec 5, 2023 at 9:05 AM 苏超腾 <richardsuc...@gmail.com>
>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Hello everyone,
>>>>>>>>>>>> 
>>>>>>>>>>>> I've encountered an issue while using Flink's native Kubernetes
>>>>>>>>>>>> integration. Despite setting resource limits in the pod template,
>>>>>>>>>>>> it appears that these limits and requests are not considered
>>>>>>>>>>>> during JobManager (JM) and TaskManager (TM) pod deployment.
>>>>>>>>>>>> 
>>>>>>>>>>>> I found that an issue had already been opened in Jira, FLINK-24150,
>>>>>>>>>>>> which describes almost the same problem that I encountered.
>>>>>>>>>>>> 
>>>>>>>>>>>> I agree that if the user has provided pod templates, we should give
>>>>>>>>>>>> them higher priority than the values Flink calculates from its
>>>>>>>>>>>> configuration.
>>>>>>>>>>>> 
>>>>>>>>>>>> But this needs some discussion in our community, because it relates
>>>>>>>>>>>> to some scenarios:
>>>>>>>>>>>> 1. If I want to create a pod with Guaranteed QoS and want the
>>>>>>>>>>>> memory of the Flink main container to be larger than the process
>>>>>>>>>>>> size of Flink, I cannot directly modify the podTemplate (although
>>>>>>>>>>>> we can use the limit factor, this will cause the QoS to change
>>>>>>>>>>>> from Guaranteed to Burstable).
>>>>>>>>>>>> 2. If I want to create a pod with Burstable QoS, I don't want to
>>>>>>>>>>>> use the limit factor and want to directly configure the request to
>>>>>>>>>>>> be 50% of the limit, which cannot be modified.
>>>>>>>>>>>> 
>>>>>>>>>>>> In order to meet these scenarios, I have submitted a pull request:
>>>>>>>>>>>> https://github.com/apache/flink/pull/23872
>>>>>>>>>>>> 
>>>>>>>>>>>> The code is very simple and just needs someone to review it. This
>>>>>>>>>>>> PR can be cherry-picked to older versions, which will be helpful.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> I would appreciate any feedback on this.
>>>>>>>>>>>> 
>>>>>>>>>>>> Thank you for your time and contributions to the Flink project.
>>>>>>>>>>>> 
>>>>>>>>>>>> Thank you,
>>>>>>>>>>>> chaoran.su
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>>> 
>>> 
>
