Re: Discussion: [FLINK-24150] Support to configure cpu resource request and limit in pod template

richard.su Tue, 05 Dec 2023 03:23:33 -0800

Hi, Gyula, from my opinion, this still will using flinkDeployment's resource 
filed to set jobManager.memory.process.size, and I have told an uncovered case 
that:


When user wants to define a flinkdeployment with jobmanager has 1G memory 
resources in container field but config jobmanager.memory.process.size as 850m, 
which this solution only improves user config and actually make sconfig more 
intuitive and easier but not make the container resource decoupling flink 
configuration.

So from my side, I think it need to add new configuration to support this 
proposal, and it need more discussion.

Thanks
Chaoran Su


> 2023年12月5日 18:28，Gyula Fóra <gyula.f...@gmail.com> 写道：
> 
> This is the proposal according to FLINK-33548:
> 
> spec:
>  taskManager:
>    resources:
>      requests:
>        memory: "64Mi"
>        cpu: "250m"
>      limits:
>        memory: "128Mi"
>        cpu: "500m"
> 
> I honestly think this is much more intuitive and easier than using the
> podTemplate, which is very complex immediately.
> Please tell me what use-case/setup is not covered by this improved spec.
> 
> Unless there is a big limitation here I am still -1 for modifying the
> podTemplate logic and +1 for continuing with FLINK-33548
> 
> Gyula
> 
> 
> 
> On Tue, Dec 5, 2023 at 11:16 AM Surendra Singh Lilhore <
> surendralilh...@gmail.com> wrote:
> 
>> Hi Gyula,
>> 
>> FLINK-33548 proposes adding a new resource field to match with Kubernetes
>> pod resource configuration. Here's my suggestion: instead of adding a new
>> resource field, let's use a pod template for more advanced resource setup.
>> Adding a new resource field might confuse users. This change can also help
>> with issues when users use Flink Kubernetes commands directly, without the
>> operator.
>> 
>> Thanks
>> Surendra
>> 
>> 
>> On Tue, Dec 5, 2023 at 3:10 PM richard.su <richardsuc...@gmail.com> wrote:
>> 
>>> Sorry Gyula,  let me explain more about the point of 2, if I avoid the
>>> override, I will got a jobmanager pod still with resources consist with
>>> “jobmanager.memory.process.size”, but a flinkdeployment with a resource
>>> larger than that.
>>> 
>>> Thanks for your time.
>>> Richard Su
>>> 
>>>> 2023年12月5日 17:13，richard.su <richardsuc...@gmail.com> 写道：
>>>> 
>>>> Thank you for your time, Gyula, I have more question about Flink-33548,
>>> we can have more discussion about this and make progress:
>>>> 
>>>> 1. I agree with you about declaring resources in FlinkDeployment
>>> resource sections. But Flink Operator will override the
>>> “jobmanager.memory.process.size”  and "taskmanager.memory.process.size",
>>> despite I have set these configuration or not in flink configuration. If
>>> user had configured all memory attributes, the override will leads to
>> error
>>> as the overall computation is error.
>>>> 
>>>> the code of override is in FlinkConfigManager.class in buildFrom
>> method,
>>> which apply to JobmanagerSpec and TaskManagerSpec.
>>>> 
>>>> 2. If I modified the code of override, I will still encounter this
>> issue
>>> of FLINK-24150, because I only modified the code of flink operator but
>> not
>>> flink-kubernetes package, so I will make a pod resources like (cpu:1c
>>> memory:1g) and container resource to be (cpu:1c, memory 850m), because I
>>> already set jobmanager.memory.process.size to 850m.
>>>> 
>>>> 3. because of there two point, we need to make the podTemplate have
>>> higher priority. Otherwise we can refactor the code of flink operator,
>>> which should import something new configuration to support the native
>> mode.
>>>> 
>>>> I think it will be better to import some configuration, which
>>> FlinkConfigManager.class can override it using the resource of
>>> JobmanagerSpec and TaskManagerSpec.
>>>> 
>>>> When it deep into the code flink-kubernetes package, we using these new
>>> configuration as the final result of containers resources.
>>>> 
>>>> Thanks for your time.
>>>> Richard Su
>>>> 
>>>>> 2023年12月5日 16:45，Gyula Fóra <gyula.f...@gmail.com> 写道：
>>>>> 
>>>>> As you can see in the jira ticket there hasn't been any progress,
>> nobody
>>>>> started to work on this yet.
>>>>> 
>>>>> I personally don't think it's confusing to declare resources in the
>>>>> FlinkDeployment resource sections. It's well documented and worked
>> very
>>>>> well so far for most users.
>>>>> This is pretty common practice for kubernetes.
>>>>> 
>>>>> Cheers,
>>>>> Gyula
>>>>> 
>>>>> On Tue, Dec 5, 2023 at 9:35 AM richard.su <richardsuc...@gmail.com>
>>> wrote:
>>>>> 
>>>>>> Hi, Gyula, is there had any progress in FLINK-33548? I would like to
>>> join
>>>>>> the discussion but I haven't seen any discussion in the url.
>>>>>> 
>>>>>> I also make flinkdeployment by flink operator, which indeed will
>>> override
>>>>>> the process size by TaskmanagerSpec.resources or
>>> JobmanagerSpec.resources,
>>>>>> which really confused, I had modified the code of flink operator to
>>> avoid
>>>>>> the override.
>>>>>> 
>>>>>> Looking for your response.
>>>>>> 
>>>>>> Thank you.
>>>>>> Richard Su
>>>>>> 
>>>>>> 
>>>>>>> 2023年12月5日 16:22，Gyula Fóra <gyula.f...@gmail.com> 写道：
>>>>>>> 
>>>>>>> Hi!
>>>>>>> 
>>>>>>> Please see the discussion in
>>>>>>> https://lists.apache.org/thread/6p5tk6obmk1qxf169so498z4vk8cg969
>>>>>>> and the ticket: https://issues.apache.org/jira/browse/FLINK-33548
>>>>>>> 
>>>>>>> We should follow the approach outlined there. If you are interested
>>> you
>>>>>> are
>>>>>>> welcome to pick up the operator ticket.
>>>>>>> 
>>>>>>> Unfortunately your PR can be a large unexpected change to existing
>>> users
>>>>>> so
>>>>>>> we should not add it.
>>>>>>> 
>>>>>>> Cheers,
>>>>>>> Gyula
>>>>>>> 
>>>>>>> On Tue, Dec 5, 2023 at 9:05 AM 苏超腾 <richardsuc...@gmail.com> wrote:
>>>>>>> 
>>>>>>>> Hello everyone,
>>>>>>>> 
>>>>>>>> I've encountered an issue while using flink kubernetes native,
>>> Despite
>>>>>>>> setting resource limits in the pod template, it appears that these
>>>>>> limits
>>>>>>>> and requests are not considered during JobManager(JM) and
>> TaskManager
>>>>>> (TM)
>>>>>>>> pod deployment.
>>>>>>>> 
>>>>>>>> I find the a issue had opened in jira  FLINK-24150, which
>> introduced
>>>>>>>> almost the same questions that I encountered.
>>>>>>>> 
>>>>>>>> I agrees that if user had provided pod templates, we should put
>>> priority
>>>>>>>> on it higher than flink calculated from configuration.
>>>>>>>> 
>>>>>>>> But this need some discussion in our community, because it related
>>> some
>>>>>>>> scenarios:
>>>>>>>> If I want to create a pod with Graranted QoS and want the memory of
>>> the
>>>>>>>> Flink main container to be larger than the process size of Flink, I
>>>>>> cannot
>>>>>>>> directly modify podTemplate (although we can use limit factor, this
>>> will
>>>>>>>> cause the QoS to change from Graranted to Burstable)
>>>>>>>> If I want to create a pod with Burstable QoS, I don't want to use
>>> limit
>>>>>>>> actor and want to directly configure the request to be 50% of the
>>> limit,
>>>>>>>> which cannot be modified.
>>>>>>>> In order to meet these scenarios, I had committed a pull request
>>>>>>>> https://github.com/apache/flink/pull/23872
>>>>>>>> 
>>>>>>>> This code is very simple and just need someone to review, this pr
>>> can be
>>>>>>>> cherry pick to other old version, which will be helpful.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> I would appreciate any feedback on this.
>>>>>>>> 
>>>>>>>> Thank you for your time and contributions to the Flink project.
>>>>>>>> 
>>>>>>>> Thank you,
>>>>>>>> chaoran.su
>>>>>> 
>>>>>> 
>>>> 
>>> 
>>> 
>>

Re: Discussion: [FLINK-24150] Support to configure cpu resource request and limit in pod template

Reply via email to