Thank you, Gyula. We are validating whether setting a larger taskmanager.memory.jvm-overhead.fraction eases this problem, and in parallel we are trying to find a way on the deployment side to ease it as well.
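For reference, this is roughly what we are validating in flink-conf.yaml; the concrete values below (the 0.15 fraction, the 4g process size and the 1g cap) are only assumptions for our RocksDB workload, not a recommendation:

    taskmanager.memory.process.size: 4g
    # JVM overhead is sized as fraction * process size, clamped between the .min and .max options
    taskmanager.memory.jvm-overhead.fraction: 0.15
    taskmanager.memory.jvm-overhead.max: 1g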
I agree with your proposal; maybe I can find some time to make a PR for FLINK-33548 <https://issues.apache.org/jira/browse/FLINK-33548>. Thank you for your time.

Richard Su

> On Dec 5, 2023, at 21:24, Gyula Fóra <gyula.f...@gmail.com> wrote:
>
> I understand your problem but I think you are trying to find a solution in the wrong place.
> Have you tried setting taskmanager.memory.jvm-overhead.fraction? That would reserve more memory from the total process memory for non-JVM use.
>
> Gyula
>
> On Tue, Dec 5, 2023 at 1:50 PM richard.su <richardsuc...@gmail.com> wrote:
>
>> Sorry, "To be clear, we need a container with memory larger than the request, and confirm this pod has Guaranteed QoS." should read "To be clear, we need a container with memory larger than process.size, and confirm this pod has Guaranteed QoS."
>>
>> Thanks.
>>
>> Richard Su
>>
>>
>>> On Dec 5, 2023, at 20:47, richard.su <richardsuc...@gmail.com> wrote:
>>>
>>> Hi, Gyula, yes, this is a special case in our scenario, sorry that it is hard to understand: we want to reserve some memory beyond the jobmanager's or taskmanager's process. To be clear, we need a container with memory larger than the request, and confirm this pod has Guaranteed QoS.
>>>
>>> This is because we hit the glibc problem inside containers for Flink jobs using RocksDB, and the reserved memory helps to ease this problem.
>>>
>>> So I hope the container resources' request can be decoupled from the Flink configuration.
>>>
>>> With Flink's current implementation, this cannot be done.
>>>
>>> Thanks.
>>>
>>> Richard Su
>>>
>>>> On Dec 5, 2023, at 20:28, Gyula Fóra <gyula.f...@gmail.com> wrote:
>>>>
>>>> Richard, I still don't understand why the current setup doesn't work for you. According to https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/deployment/memory/mem_setup/ :
>>>>
>>>> The process memory config (which is what we configure) translates directly into the container request size. With the new proposal you can set the limit independently.
>>>>
>>>> What you write doesn't make sense to me:
>>>> "user wants to define a flinkdeployment with jobmanager has 1G memory resources in container field but config jobmanager.memory.process.size as 850m"
>>>>
>>>> If you want to have a 1G container you set the memory request (process.size) in the spec simply to 1G. Then you have 1G; there are other configs on how this 1G will be split inside the container for various purposes, but these are all covered in detail by the Flink memory configs.
>>>>
>>>> Cheers
>>>> Gyula
>>>>
>>>> On Tue, Dec 5, 2023 at 1:06 PM richard.su <richardsuc...@gmail.com> wrote:
>>>>
>>>>> I think the new configuration could be:
>>>>>
>>>>> "kubernetes.taskmanager.memory.amount" and "kubernetes.jobmanager.memory.amount",
>>>>>
>>>>> since we can calculate the limit-factor from the difference between requests and limits.
>>>>>
>>>>> In native mode, we would no longer check process.size as the default memory, but use this configuration for the decoupling logic.
>>>>>
>>>>> Thanks
>>>>>
>>>>> Richard Su
>>>>>
>>>>>> On Dec 5, 2023, at 19:22, richard.su <richardsuc...@gmail.com> wrote:
>>>>>>
>>>>>> Hi, Gyula, in my opinion this still uses the FlinkDeployment's resource field to set jobmanager.memory.process.size, and I have already described an uncovered case:
>>>>>>
>>>>>> A user wants to define a FlinkDeployment whose jobmanager has 1G memory resources in the container field but configures jobmanager.memory.process.size as 850m. This solution only improves the user config and indeed makes it more intuitive and easier, but it does not decouple the container resources from the Flink configuration.
>>>>>>
>>>>>> So from my side, I think we need to add a new configuration to support this proposal, and it needs more discussion.
>>>>>>
>>>>>> Thanks
>>>>>> Chaoran Su
>>>>>>
>>>>>>
>>>>>>> On Dec 5, 2023, at 18:28, Gyula Fóra <gyula.f...@gmail.com> wrote:
>>>>>>>
>>>>>>> This is the proposal according to FLINK-33548:
>>>>>>>
>>>>>>> spec:
>>>>>>>   taskManager:
>>>>>>>     resources:
>>>>>>>       requests:
>>>>>>>         memory: "64Mi"
>>>>>>>         cpu: "250m"
>>>>>>>       limits:
>>>>>>>         memory: "128Mi"
>>>>>>>         cpu: "500m"
>>>>>>>
>>>>>>> I honestly think this is much more intuitive and easier than using the podTemplate, which gets very complex immediately.
>>>>>>> Please tell me what use-case/setup is not covered by this improved spec.
>>>>>>>
>>>>>>> Unless there is a big limitation here I am still -1 for modifying the podTemplate logic and +1 for continuing with FLINK-33548.
>>>>>>>
>>>>>>> Gyula
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Dec 5, 2023 at 11:16 AM Surendra Singh Lilhore <surendralilh...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Gyula,
>>>>>>>>
>>>>>>>> FLINK-33548 proposes adding a new resource field to match the Kubernetes pod resource configuration. Here's my suggestion: instead of adding a new resource field, let's use a pod template for more advanced resource setup. Adding a new resource field might confuse users. This change can also help with issues when users use Flink Kubernetes commands directly, without the operator.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Surendra
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Dec 5, 2023 at 3:10 PM richard.su <richardsuc...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Sorry Gyula, let me explain more about point 2: if I avoid the override, I still get a jobmanager pod with resources consistent with "jobmanager.memory.process.size", but a FlinkDeployment with resources larger than that.
>>>>>>>>>
>>>>>>>>> Thanks for your time.
>>>>>>>>> Richard Su
>>>>>>>>>
>>>>>>>>>> On Dec 5, 2023, at 17:13, richard.su <richardsuc...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Thank you for your time, Gyula. I have a few more questions about FLINK-33548; we can discuss them further and make progress:
>>>>>>>>>>
>>>>>>>>>> 1. I agree with you about declaring resources in the FlinkDeployment resource sections. But the Flink Operator will override "jobmanager.memory.process.size" and "taskmanager.memory.process.size" regardless of whether I have set these options in the Flink configuration. If the user has configured all memory attributes, the override leads to an error because the overall memory computation becomes wrong.
>>>>>>>>>>
>>>>>>>>>> The override code is in FlinkConfigManager.class, in the buildFrom method, and it applies to the JobmanagerSpec and TaskManagerSpec.
>>>>>>>>>>
>>>>>>>>>> 2. If I modify the override code, I will still run into the issue described in FLINK-24150, because I only modified the Flink operator code and not the flink-kubernetes package. So I end up with pod resources like (cpu: 1c, memory: 1g) but container resources of (cpu: 1c, memory: 850m), because I already set jobmanager.memory.process.size to 850m.
>>>>>>>>>>
>>>>>>>>>> 3. Because of these two points, we need to give the podTemplate higher priority. Otherwise we can refactor the Flink operator code, which would have to introduce some new configuration to support the native mode.
>>>>>>>>>>
>>>>>>>>>> I think it would be better to introduce some new configuration that FlinkConfigManager.class can override using the resources of the JobmanagerSpec and TaskManagerSpec.
>>>>>>>>>>
>>>>>>>>>> Deeper in the flink-kubernetes package, we would then use this new configuration as the final container resources.
>>>>>>>>>>
>>>>>>>>>> Thanks for your time.
>>>>>>>>>> Richard Su
>>>>>>>>>>
>>>>>>>>>>> On Dec 5, 2023, at 16:45, Gyula Fóra <gyula.f...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> As you can see in the jira ticket there hasn't been any progress, nobody has started to work on this yet.
>>>>>>>>>>>
>>>>>>>>>>> I personally don't think it's confusing to declare resources in the FlinkDeployment resource sections. It's well documented and has worked very well so far for most users.
>>>>>>>>>>> This is pretty common practice for kubernetes.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Gyula
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Dec 5, 2023 at 9:35 AM richard.su <richardsuc...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi, Gyula, has there been any progress on FLINK-33548? I would like to join the discussion but I haven't seen any discussion at that URL.
>>>>>>>>>>>>
>>>>>>>>>>>> I also create FlinkDeployments with the Flink operator, which indeed overrides the process size with TaskmanagerSpec.resources or JobmanagerSpec.resources, which is really confusing; I had to modify the Flink operator code to avoid the override.
>>>>>>>>>>>>
>>>>>>>>>>>> Looking forward to your response.
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you.
>>>>>>>>>>>> Richard Su
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> On Dec 5, 2023, at 16:22, Gyula Fóra <gyula.f...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi!
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please see the discussion in https://lists.apache.org/thread/6p5tk6obmk1qxf169so498z4vk8cg969 and the ticket: https://issues.apache.org/jira/browse/FLINK-33548
>>>>>>>>>>>>>
>>>>>>>>>>>>> We should follow the approach outlined there. If you are interested you are welcome to pick up the operator ticket.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Unfortunately your PR can be a large unexpected change to existing users, so we should not add it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Gyula
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Dec 5, 2023 at 9:05 AM 苏超腾 <richardsuc...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hello everyone,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've encountered an issue while using Flink Kubernetes native. Despite setting resource limits in the pod template, it appears that these limits and requests are not considered during JobManager (JM) and TaskManager (TM) pod deployment.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I found an issue already opened in Jira, FLINK-24150, which raises almost the same questions that I encountered.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I agree that if the user has provided pod templates, we should give them higher priority than what Flink calculates from the configuration.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> But this needs some discussion in our community, because it relates to some scenarios:
>>>>>>>>>>>>>> If I want to create a pod with Guaranteed QoS and want the memory of the Flink main container to be larger than Flink's process size, I cannot directly modify the podTemplate (although we can use the limit factor, this will change the QoS from Guaranteed to Burstable).
>>>>>>>>>>>>>> If I want to create a pod with Burstable QoS, I don't want to use the limit factor and want to directly configure the request to be 50% of the limit, which cannot be modified.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In order to meet these scenarios, I have committed a pull request: https://github.com/apache/flink/pull/23872
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The code is very simple and just needs someone to review it; this PR can be cherry-picked to older versions, which will be helpful.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I would appreciate any feedback on this.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you for your time and contributions to the Flink project.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you,
>>>>>>>>>>>>>> chaoran.su