Hi,

I cannot see the difference between the two configurations, but the error `Failure executing: POST at: https://*/k8s/clusters/c-fwkxh/apis/apps/v1/namespaces/test-flink/deployments. Message: Deployment.apps "basic-example" is invalid` looks strange. Could you check whether the Kubernetes configuration has changed?
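
In case a difference is easy to miss by eye, the two podTemplate sections can also be compared mechanically. Below is a minimal sketch in Python, assuming the two sections are transcribed by hand into plain dicts from the specs quoted below. The helper name `diff_paths` is hypothetical and is not part of Flink or the operator.

```python
def diff_paths(old, new, path=""):
    """Return the paths at which two nested dict/list structures differ."""
    if isinstance(old, dict) and isinstance(new, dict):
        diffs = []
        for key in sorted(set(old) | set(new)):
            child = f"{path}.{key}" if path else key
            if key not in old:
                diffs.append(f"{child} (only in new)")
            elif key not in new:
                diffs.append(f"{child} (only in old)")
            else:
                diffs.extend(diff_paths(old[key], new[key], child))
        return diffs
    if isinstance(old, list) and isinstance(new, list):
        diffs = []
        for i in range(max(len(old), len(new))):
            child = f"{path}[{i}]"
            if i >= len(old):
                diffs.append(f"{child} (only in new)")
            elif i >= len(new):
                diffs.append(f"{child} (only in old)")
            else:
                diffs.extend(diff_paths(old[i], new[i], child))
        return diffs
    # Scalars (or mismatched types): report the path if the values differ.
    return [] if old == new else [f"{path} (changed)"]


# Assumption: podTemplate sections transcribed by hand from the quoted specs.
last_stable = {"spec": {"containers": [
    {"name": "flink-main-container",
     "env": [{"name": "TZ", "value": "Asia/Shanghai"}]},
]}}
new_spec = {"spec": {"containers": [
    {"env": [{"name": "TZ", "value": "Asia/Shanghai"}]},
]}}

differences = diff_paths(last_stable, new_spec)
print(differences)
```

If the transcription is accurate, the only path reported is `spec.containers[0].name`, which is also the first field named in the 422 response (`spec.template.spec.containers[0].name: Required value`).
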
Best,
Shammon

On Mon, Feb 20, 2023 at 12:56 AM hjw <[email protected]> wrote:

> I ran a test of the application upgrade rollback feature, but it
> fails. The Flink application mode job cannot roll back to the last
> stable spec.
> As shown in the following example, I declared an invalid pod template
> without a container named flink-main-container to test the rollback feature.
> However, the deployment of the Flink application job just failed with
> the error below, and no rollback happened.
>
> Error:
> org.apache.flink.client.deployment.ClusterDeploymentException: Could not
> create Kubernetes cluster "basic-example".
>     at org.apache.flink.kubernetes.KubernetesClusterDescriptor.deployClusterInternal(KubernetesClusterDescriptor.java:292)
> Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure
> executing: POST at:
> https://*/k8s/clusters/c-fwkxh/apis/apps/v1/namespaces/test-flink/deployments.
> Message: Deployment.apps "basic-example" is invalid:
> [spec.template.spec.containers[0].name: Required value,
> spec.template.spec.containers[0].image: Required value]. Received status:
> Status(apiVersion=v1, code=422,
> details=StatusDetails(causes=[StatusCause(field=spec.template.spec.containers[0].name,
> message=Required value, reason=FieldValueRequired,
> additionalProperties={}),
> StatusCause(field=spec.template.spec.containers[0].image, message=Required
> value, reason=FieldValueRequired, additionalProperties={})], group=apps,
> kind=Deployment, name=flink-bdra-sql-application-job-s3p,
> retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status,
> message=Deployment.apps "flink-bdra-sql-application-job-s3p" is invalid:
> [spec.template.spec.containers[0].name: Required value,
> spec.template.spec.containers[0].image: Required value],
> metadata=ListMeta(_continue=null, remainingItemCount=null,
> resourceVersion=null, selfLink=null, additionalProperties={}),
> reason=Invalid, status=Failure, additionalProperties={}).
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:673)
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:612)
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:560)
>
> Env:
> Flink version: Flink 1.16
> Flink Kubernetes Operator: 1.3.1
>
> *Last stable spec:*
> apiVersion: flink.apache.org/v1beta1
> kind: FlinkDeployment
> metadata:
>   name: basic-example
> spec:
>   image: flink:1.16
>   flinkVersion: v1_16
>   flinkConfiguration:
>     taskmanager.numberOfTaskSlots: "2"
>     kubernetes.operator.deployment.rollback.enabled: true
>     state.savepoints.dir: s3:///flink-data/savepoints
>     state.checkpoints.dir: s3:///flink-data/checkpoints
>     high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
>     high-availability.storageDir: s3:///flink-data/ha
>   serviceAccount: flink
>   podTemplate:
>     spec:
>       containers:
>         - name: flink-main-container
>           env:
>             - name: TZ
>               value: Asia/Shanghai
>   jobManager:
>     resource:
>       memory: "2048m"
>       cpu: 1
>   taskManager:
>     resource:
>       memory: "2048m"
>       cpu: 1
>   job:
>     jarURI: local:///opt/flink/examples/streaming/StateMachineExample.jar
>     parallelism: 2
>     upgradeMode: stateless
>
> *New spec:*
> apiVersion: flink.apache.org/v1beta1
> kind: FlinkDeployment
> metadata:
>   name: basic-example
> spec:
>   image: flink:1.16
>   flinkVersion: v1_16
>   flinkConfiguration:
>     taskmanager.numberOfTaskSlots: "2"
>     kubernetes.operator.deployment.rollback.enabled: true
>     state.savepoints.dir: s3:///flink-data/savepoints
>     state.checkpoints.dir: s3:///flink-data/checkpoints
>     high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
>     high-availability.storageDir: s3:///flink-data/ha
>   serviceAccount: flink
>   podTemplate:
>     spec:
>       containers:
>         - env:
>             - name: TZ
>               value: Asia/Shanghai
>   jobManager:
>     resource:
>       memory: "2048m"
>       cpu: 1
>   taskManager:
>     resource:
>       memory: "2048m"
>       cpu: 1
>   job:
>     jarURI: local:///opt/flink/examples/streaming/StateMachineExample.jar
>     parallelism: 2
>     upgradeMode: stateless
>
> --
> Best,
> Hjw
