hjw created FLINK-31203:
---------------------------
Summary: Application upgrade rollbacks failed in Flink Kubernetes Operator
Key: FLINK-31203
URL: https://issues.apache.org/jira/browse/FLINK-31203
Project: Flink
Issue Type: Bug
Components: Kubernetes Operator
Affects Versions: kubernetes-operator-1.3.1
Reporter: hjw
I tested the application upgrade rollback feature, but it does not work:
a Flink application-mode job does not roll back to the last stable spec.
As shown in the example below, I declared an invalid pod template that has no
container named flink-main-container in order to test the rollback feature.
However, the deployment of the Flink application job simply fails with the
error below, and no rollback takes place.
Error:
org.apache.flink.client.deployment.ClusterDeploymentException: Could not create
Kubernetes cluster "basic-example".
    at org.apache.flink.kubernetes.KubernetesClusterDescriptor.deployClusterInternal(KubernetesClusterDescriptor.java:292)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure
executing: POST at:
https://*/k8s/clusters/c-fwkxh/apis/apps/v1/namespaces/test-flink/deployments.
Message: Deployment.apps "basic-example" is invalid:
[spec.template.spec.containers[0].name: Required value,
spec.template.spec.containers[0].image: Required value]. Received status:
Status(apiVersion=v1, code=422,
details=StatusDetails(causes=[StatusCause(field=spec.template.spec.containers[0].name,
message=Required value, reason=FieldValueRequired, additionalProperties={}),
StatusCause(field=spec.template.spec.containers[0].image, message=Required
value, reason=FieldValueRequired, additionalProperties={})], group=apps,
kind=Deployment, name=basic-example, retryAfterSeconds=null, uid=null,
additionalProperties={}), kind=Status, message=Deployment.apps "basic-example"
is invalid: [spec.template.spec.containers[0].name: Required value,
spec.template.spec.containers[0].image: Required value],
metadata=ListMeta(_continue=null, remainingItemCount=null,
resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid,
status=Failure, additionalProperties={}).
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:673)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:612)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:560)
Env:
Flink version: 1.16
Flink Kubernetes Operator: 1.3.1
*Last stable spec:*
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: basic-example
spec:
  image: flink:1.16
  flinkVersion: v1_16
  flinkConfiguration:
    taskmanager.numberOfTaskSlots: "2"
    kubernetes.operator.deployment.rollback.enabled: true
    state.savepoints.dir: s3://flink-data/savepoints
    state.checkpoints.dir: s3://flink-data/checkpoints
    high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
    high-availability.storageDir: s3://flink-data/ha
  serviceAccount: flink
  podTemplate:
    spec:
      containers:
        - name: flink-main-container
          env:
            - name: TZ
              value: Asia/Shanghai
  jobManager:
    resource:
      memory: "2048m"
      cpu: 1
  taskManager:
    resource:
      memory: "2048m"
      cpu: 1
  job:
    jarURI: local:///opt/flink/examples/streaming/StateMachineExample.jar
    parallelism: 2
    upgradeMode: stateless
*New spec:*
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: basic-example
spec:
  image: flink:1.16
  flinkVersion: v1_16
  flinkConfiguration:
    taskmanager.numberOfTaskSlots: "2"
    kubernetes.operator.deployment.rollback.enabled: true
    state.savepoints.dir: s3://flink-data/savepoints
    state.checkpoints.dir: s3://flink-data/checkpoints
    high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
    high-availability.storageDir: s3://flink-data/ha
  serviceAccount: flink
  podTemplate:
    spec:
      containers:
        - env:
            - name: TZ
              value: Asia/Shanghai
  jobManager:
    resource:
      memory: "2048m"
      cpu: 1
  taskManager:
    resource:
      memory: "2048m"
      cpu: 1
  job:
    jarURI: local:///opt/flink/examples/streaming/StateMachineExample.jar
    parallelism: 2
    upgradeMode: stateless
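
The only difference between the two specs is the missing container name in the pod template. A minimal sketch of a check for that difference is shown here; `has_main_container` is a hypothetical helper written for illustration and is not part of the operator, and the two dictionaries simply mirror the podTemplate sections of the specs above.

```python
# Hypothetical helper for illustration only; not part of the Flink Kubernetes Operator.
MAIN_CONTAINER = "flink-main-container"


def has_main_container(pod_template: dict) -> bool:
    """Return True if the pod template declares a container named flink-main-container."""
    spec = pod_template.get("spec") or {}
    containers = spec.get("containers") or []
    return any(c.get("name") == MAIN_CONTAINER for c in containers)


# podTemplate of the last stable spec: the container has the expected name.
last_stable = {
    "spec": {
        "containers": [
            {
                "name": "flink-main-container",
                "env": [{"name": "TZ", "value": "Asia/Shanghai"}],
            }
        ]
    }
}

# podTemplate of the new spec: the container has no name at all.
new_spec = {
    "spec": {
        "containers": [
            {"env": [{"name": "TZ", "value": "Asia/Shanghai"}]}
        ]
    }
}

print(has_main_container(last_stable))  # True
print(has_main_container(new_spec))     # False
```

The unnamed container in the new spec matches the two "Required value" causes for containers[0].name and containers[0].image in the error above.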