[ 
https://issues.apache.org/jira/browse/FLINK-17090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17151233#comment-17151233
 ] 

Canbin Zheng commented on FLINK-17090:
--------------------------------------

[~chesnay]

I think we could introduce such checks in {{KubernetesJobManagerParameters}} or 
{{KubernetesTaskManagerParameters}}.
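
A minimal sketch of what such a check might look like, written as a
standalone class so it runs without Flink on the classpath. The class name
{{CpuPrecheck}} and the method {{requireNonNegative}} are hypothetical and
are not existing Flink APIs; in Flink itself the check would presumably sit
where the parameters classes read the CPU option from the configuration.

```java
/**
 * Hypothetical helper, not part of Flink: validates a CPU setting on the
 * client side before any request is sent to the Kubernetes API server.
 */
final class CpuPrecheck {

    private CpuPrecheck() {}

    /**
     * Returns the value unchanged if it is a non-negative number; otherwise
     * throws with a message that names the offending config key.
     */
    static double requireNonNegative(String configKey, double value) {
        // !(value >= 0) also rejects NaN, which a plain (value < 0) would let through.
        if (!(value >= 0)) {
            throw new IllegalArgumentException(String.format(
                "The value of %s must be greater than or equal to 0, but was %s.",
                configKey, value));
        }
        return value;
    }

    public static void main(String[] args) {
        // Mirrors -Dkubernetes.jobmanager.cpu=-3.0 from the issue description.
        try {
            requireNonNegative("kubernetes.jobmanager.cpu", -3.0);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing at this point would surface the config key in the error message
instead of the 422 response from the Kubernetes API server.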

> Harden precheck for the KubernetesConfigOptions.JOB_MANAGER_CPU
> ---------------------------------------------------------------
>
>                 Key: FLINK-17090
>                 URL: https://issues.apache.org/jira/browse/FLINK-17090
>             Project: Flink
>          Issue Type: Improvement
>          Components: Deployment / Kubernetes
>    Affects Versions: 1.10.0
>            Reporter: Canbin Zheng
>            Priority: Minor
>             Fix For: 1.11.0
>
>
> If a user specifies a negative value for the config option 
> {{KubernetesConfigOptions#JOB_MANAGER_CPU}}, as the following command 
> does,
> {code:java}
> ./bin/kubernetes-session.sh -Dkubernetes.jobmanager.cpu=-3.0 
> -Dkubernetes.cluster-id=...{code}
> then the deployment fails with the following exception:
> {quote}org.apache.flink.client.deployment.ClusterDeploymentException: Could 
> not create Kubernetes cluster "felix1".
>  at 
> org.apache.flink.kubernetes.KubernetesClusterDescriptor.deployClusterInternal(KubernetesClusterDescriptor.java:192)
>  at 
> org.apache.flink.kubernetes.KubernetesClusterDescriptor.deploySessionCluster(KubernetesClusterDescriptor.java:129)
>  at 
> org.apache.flink.kubernetes.cli.KubernetesSessionCli.run(KubernetesSessionCli.java:108)
>  at 
> org.apache.flink.kubernetes.cli.KubernetesSessionCli.lambda$main$0(KubernetesSessionCli.java:185)
>  at 
> org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
>  at 
> org.apache.flink.kubernetes.cli.KubernetesSessionCli.main(KubernetesSessionCli.java:185)
> Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure 
> executing: POST at: 
> [https://cls-cf5wqdwy.ccs.tencent-cloud.com/apis/apps/v1/namespaces/default/deployments].
>  Message: Deployment.apps "felix1" is invalid: 
> [spec.template.spec.containers[0].resources.limits[cpu]: Invalid value: "-3": 
> must be greater than or equal to 0, 
> spec.template.spec.containers[0].resources.requests[cpu]: Invalid value: 
> "-3": must be greater than or equal to 0]. Received status: 
> Status(apiVersion=v1, code=422, 
> details=StatusDetails(causes=[StatusCause(field=spec.template.spec.containers[0].resources.limits[cpu],
>  message=Invalid value: "-3": must be greater than or equal to 0, 
> reason=FieldValueInvalid, additionalProperties={}), 
> StatusCause(field=spec.template.spec.containers[0].resources.requests[cpu], 
> message=Invalid value: "-3": must be greater than or equal to 0, 
> reason=FieldValueInvalid, additionalProperties={})], group=apps, 
> kind=Deployment, name=felix1, retryAfterSeconds=null, uid=null, 
> additionalProperties={}), kind=Status, message=Deployment.apps "felix1" is 
> invalid: [spec.template.spec.containers[0].resources.limits[cpu]: Invalid 
> value: "-3": must be greater than or equal to 0, 
> spec.template.spec.containers[0].resources.requests[cpu]: Invalid value: 
> "-3": must be greater than or equal to 0], metadata=ListMeta(_continue=null, 
> resourceVersion=null, selfLink=null, additionalProperties={}), 
> reason=Invalid, status=Failure, additionalProperties={}).
>  at 
> io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:510)
>  at 
> io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:449)
>  at 
> io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:413)
>  at 
> io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:372)
>  at 
> io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:241)
>  at 
> io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:798)
>  at 
> io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:328)
>  at 
> io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:324)
>  at 
> org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.createJobManagerComponent(Fabric8FlinkKubeClient.java:83)
>  at 
> org.apache.flink.kubernetes.KubernetesClusterDescriptor.deployClusterInternal(KubernetesClusterDescriptor.java:182)
> {quote}
> Since there is a gap between the Flink-side and the K8s-side configuration 
> models (Flink accepts the negative value, Kubernetes rejects it), this 
> ticket proposes hardening the precheck in the Flink K8s parameter parsing 
> tool and throwing a more user-friendly exception message, such as 
> "the value of {{kubernetes.jobmanager.cpu}} must be greater than or equal to 
> 0".



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
