[ https://issues.apache.org/jira/browse/FLINK-38047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated FLINK-38047:
-----------------------------------
    Labels: dependency pull-request-available  (was: dependency)

> Bump cert-manager in the Kubernetes Operator
> --------------------------------------------
>
>                 Key: FLINK-38047
>                 URL: https://issues.apache.org/jira/browse/FLINK-38047
>             Project: Flink
>          Issue Type: Technical Debt
>          Components: Kubernetes Operator
>            Reporter: Kumar Mallikarjuna
>            Priority: Major
>              Labels: dependency, pull-request-available
>
> Flink Kubernetes Operator currently uses cert-manager:{_}v1.8.2{_} in the
> [CI|https://github.com/apache/flink-kubernetes-operator/blob/main/e2e-tests/cert-manager.yaml]
> and recommends the same version in the
> [docs|https://github.com/apache/flink-kubernetes-operator/blob/8812c78cd6a2c0ad1b672ca08a8b880bd890ae8b/docs/content/docs/try-flink-kubernetes-operator/quick-start.md?plain=1#L69-L72].
> The latest stable release, _v1.18.2_, is ten minor versions ahead. We should
> bump the recommendation and the tests to the latest release.
>
> *Validation for _cert-manager:v1.18.2_ with _flink-kubernetes-operator:v1.12.0_*
> 1. Start a kind cluster
> {code:java}
> ➜ flink-kubernetes-operator git:(main) ✗ kind create cluster
> Creating cluster "kind" ...
>  ✓ Ensuring node image (kindest/node:v1.32.2) 🖼
>  ✓ Preparing nodes 📦
>  ✓ Writing configuration 📜
>  ✓ Starting control-plane 🕹️
>  ✓ Installing CNI 🔌
>  ✓ Installing StorageClass 💾
> Set kubectl context to "kind-kind"
> You can now use your cluster with:
>
> kubectl cluster-info --context kind-kind
>
> Have a nice day! 👋
> {code}
>
> 2. 
Install cert-manager v1.18.2
> {code:java}
> ➜ flink-kubernetes-operator git:(main) ✗ kubectl create -f https://github.com/cert-manager/cert-manager/releases/download/v1.18.2/cert-manager.yaml
> namespace/cert-manager created
> customresourcedefinition.apiextensions.k8s.io/certificaterequests.cert-manager.io created
> customresourcedefinition.apiextensions.k8s.io/certificates.cert-manager.io created
> customresourcedefinition.apiextensions.k8s.io/challenges.acme.cert-manager.io created
> customresourcedefinition.apiextensions.k8s.io/clusterissuers.cert-manager.io created
> customresourcedefinition.apiextensions.k8s.io/issuers.cert-manager.io created
> customresourcedefinition.apiextensions.k8s.io/orders.acme.cert-manager.io created
> serviceaccount/cert-manager-cainjector created
> serviceaccount/cert-manager created
> serviceaccount/cert-manager-webhook created
> clusterrole.rbac.authorization.k8s.io/cert-manager-cainjector created
> clusterrole.rbac.authorization.k8s.io/cert-manager-controller-issuers created
> clusterrole.rbac.authorization.k8s.io/cert-manager-controller-clusterissuers created
> clusterrole.rbac.authorization.k8s.io/cert-manager-controller-certificates created
> clusterrole.rbac.authorization.k8s.io/cert-manager-controller-orders created
> clusterrole.rbac.authorization.k8s.io/cert-manager-controller-challenges created
> clusterrole.rbac.authorization.k8s.io/cert-manager-controller-ingress-shim created
> clusterrole.rbac.authorization.k8s.io/cert-manager-cluster-view created
> clusterrole.rbac.authorization.k8s.io/cert-manager-view created
> clusterrole.rbac.authorization.k8s.io/cert-manager-edit created
> clusterrole.rbac.authorization.k8s.io/cert-manager-controller-approve:cert-manager-io created
> clusterrole.rbac.authorization.k8s.io/cert-manager-controller-certificatesigningrequests created
> clusterrole.rbac.authorization.k8s.io/cert-manager-webhook:subjectaccessreviews created
> clusterrolebinding.rbac.authorization.k8s.io/cert-manager-cainjector created
> clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-issuers created
> clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-clusterissuers created
> clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-certificates created
> clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-orders created
> clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-challenges created
> clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-ingress-shim created
> clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-approve:cert-manager-io created
> clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-certificatesigningrequests created
> clusterrolebinding.rbac.authorization.k8s.io/cert-manager-webhook:subjectaccessreviews created
> role.rbac.authorization.k8s.io/cert-manager-cainjector:leaderelection created
> role.rbac.authorization.k8s.io/cert-manager:leaderelection created
> role.rbac.authorization.k8s.io/cert-manager-tokenrequest created
> role.rbac.authorization.k8s.io/cert-manager-webhook:dynamic-serving created
> rolebinding.rbac.authorization.k8s.io/cert-manager-cainjector:leaderelection created
> rolebinding.rbac.authorization.k8s.io/cert-manager:leaderelection created
> rolebinding.rbac.authorization.k8s.io/cert-manager-cert-manager-tokenrequest created
> rolebinding.rbac.authorization.k8s.io/cert-manager-webhook:dynamic-serving created
> service/cert-manager-cainjector created
> service/cert-manager created
> service/cert-manager-webhook created
> deployment.apps/cert-manager-cainjector created
> deployment.apps/cert-manager created
> deployment.apps/cert-manager-webhook created
> mutatingwebhookconfiguration.admissionregistration.k8s.io/cert-manager-webhook created
> validatingwebhookconfiguration.admissionregistration.k8s.io/cert-manager-webhook created
> {code}
>
> 3. Wait for cert-manager to be ready
> {code:java}
> ➜ flink-kubernetes-operator git:(main) ✗ k -n cert-manager get po
> NAME                                       READY   STATUS    RESTARTS   AGE
> cert-manager-69f748766f-28s8d              1/1     Running   0          44s
> cert-manager-cainjector-7cf6557c49-gdfd7   1/1     Running   0          44s
> cert-manager-webhook-58f4cff74d-kz4pc      1/1     Running   0          44s
> {code}
>
> 4. Install flink-kubernetes-operator
> {code:java}
> ➜ flink-kubernetes-operator git:(main) ✗ helm install flink-kubernetes-operator flink-operator-repo/flink-kubernetes-operator
> W0704 14:33:26.593488 51760 warnings.go:70] spec.privateKey.rotationPolicy: In cert-manager >= v1.18.0, the default value changed from `Never` to `Always`.
> NAME: flink-kubernetes-operator
> LAST DEPLOYED: Fri Jul 4 14:33:25 2025
> NAMESPACE: default
> STATUS: deployed
> REVISION: 1
> TEST SUITE: None
> {code}
>
> *Note:* The warning about _spec.privateKey.rotationPolicy_ is expected and
> can be ignored since it does not affect the functionality of the
> operator/webhook.
>
> 5. Verify the operator/webhook are running
> {code:java}
> ➜ flink-kubernetes-operator git:(main) ✗ k get po
> NAME                                         READY   STATUS    RESTARTS   AGE
> flink-kubernetes-operator-7dc7858566-42g5z   2/2     Running   0          112s
> {code}
>
> 6. Test with a sample FlinkDeployment
> {code:java}
> ➜ flink-kubernetes-operator git:(main) ✗ k create -f examples/basic.yaml
> flinkdeployment.flink.apache.org/basic-example created
>
> ➜ flink-kubernetes-operator git:(main) ✗ k get flinkdeployments.flink.apache.org
> NAME            JOB STATUS   LIFECYCLE STATE
> basic-example   RUNNING      STABLE
> ➜ flink-kubernetes-operator git:(main) ✗ k get po
> NAME                                         READY   STATUS    RESTARTS   AGE
> basic-example-6c7bff5c68-w669x               1/1     Running   0          70s
> basic-example-taskmanager-1-1                1/1     Running   0          23s
> flink-kubernetes-operator-7dc7858566-42g5z   2/2     Running   0          3m27s
> {code}
>
> 7. 
Clean up the FlinkDeployment
> {code:java}
> ➜ flink-kubernetes-operator git:(main) ✗ k delete flinkdeployments.flink.apache.org basic-example
> flinkdeployment.flink.apache.org "basic-example" deleted
> {code}
>
> 8. Force rotate the certificate
> {code:java}
> ➜ flink-kubernetes-operator git:(main) ✗ k get certificate
> NAME                          READY   SECRET                AGE
> flink-operator-serving-cert   True    webhook-server-cert   4m48s
> ➜ flink-kubernetes-operator git:(main) ✗ k get certificate flink-operator-serving-cert -oyaml
> apiVersion: cert-manager.io/v1
> kind: Certificate
> metadata:
>   annotations:
>     meta.helm.sh/release-name: flink-kubernetes-operator
>     meta.helm.sh/release-namespace: default
>   creationTimestamp: "2025-07-04T09:03:26Z"
>   generation: 1
>   labels:
>     app.kubernetes.io/managed-by: Helm
>   name: flink-operator-serving-cert
>   namespace: default
>   resourceVersion: "997"
>   uid: b0e1935c-eab8-4b61-ad9f-7bb0bf166c07
> spec:
>   commonName: FlinkDeployment Validator
>   dnsNames:
>   - flink-operator-webhook-service.default.svc
>   - flink-operator-webhook-service.default.svc.cluster.local
>   issuerRef:
>     kind: Issuer
>     name: flink-operator-selfsigned-issuer
>   keystores:
>     pkcs12:
>       create: true
>       passwordSecretRef:
>         key: password
>         name: flink-operator-webhook-secret
>   secretName: webhook-server-cert
> status:
>   conditions:
>   - lastTransitionTime: "2025-07-04T09:03:26Z"
>     message: Certificate is up to date and has not expired
>     observedGeneration: 1
>     reason: Ready
>     status: "True"
>     type: Ready
>   notAfter: "2025-10-02T09:03:26Z"
>   notBefore: "2025-07-04T09:03:26Z"
>   renewalTime: "2025-09-02T09:03:26Z"
>   revision: 1
> ➜ flink-kubernetes-operator git:(main) ✗ cmctl renew flink-operator-serving-cert
> Manually triggered issuance of Certificate default/flink-operator-serving-cert
> ➜ flink-kubernetes-operator git:(main) ✗ k get certificate flink-operator-serving-cert -oyaml
> apiVersion: cert-manager.io/v1
> kind: Certificate
> metadata:
>   annotations:
>     meta.helm.sh/release-name: flink-kubernetes-operator
>     meta.helm.sh/release-namespace: default
>   creationTimestamp: "2025-07-04T09:03:26Z"
>   generation: 1
>   labels:
>     app.kubernetes.io/managed-by: Helm
>   name: flink-operator-serving-cert
>   namespace: default
>   resourceVersion: "1591"
>   uid: b0e1935c-eab8-4b61-ad9f-7bb0bf166c07
> spec:
>   commonName: FlinkDeployment Validator
>   dnsNames:
>   - flink-operator-webhook-service.default.svc
>   - flink-operator-webhook-service.default.svc.cluster.local
>   issuerRef:
>     kind: Issuer
>     name: flink-operator-selfsigned-issuer
>   keystores:
>     pkcs12:
>       create: true
>       passwordSecretRef:
>         key: password
>         name: flink-operator-webhook-secret
>   secretName: webhook-server-cert
> status:
>   conditions:
>   - lastTransitionTime: "2025-07-04T09:03:26Z"
>     message: Certificate is up to date and has not expired
>     observedGeneration: 1
>     reason: Ready
>     status: "True"
>     type: Ready
>   notAfter: "2025-10-02T09:08:37Z"
>   notBefore: "2025-07-04T09:08:37Z"
>   renewalTime: "2025-09-02T09:08:37Z"
>   revision: 2
> {code}
>
> 9. Verify the operator/webhook are still running
> {code:java}
> ➜ flink-kubernetes-operator git:(main) ✗ k get po
> NAME                                         READY   STATUS    RESTARTS   AGE
> flink-kubernetes-operator-7dc7858566-42g5z   2/2     Running   0          5m50s
> {code}
>
> 10. 
Check the webhook logs and verify that the certificate was reloaded
> {code:java}
> ➜ flink-kubernetes-operator git:(main) ✗ k logs flink-kubernetes-operator-7dc7858566-42g5z -c flink-webhook | tail -20
> 2025-07-04 09:03:57,113 o.a.f.k.o.f.FileSystemWatchService [INFO ] Starting watching path: /certs
> 2025-07-04 09:03:57,117 o.a.f.k.o.f.FileSystemWatchService [INFO ] Path is resolved to real path: /certs
> 2025-07-04 09:03:57,186 o.a.f.k.o.a.FlinkOperatorWebhook [INFO ] Webhook listening at 0:0:0:0:0:0:0:0:9443
> 2025-07-04 09:08:47,807 o.a.f.k.o.a.FlinkOperatorWebhook [INFO ] Reloading SSL context because of certificate change
> 2025-07-04 09:08:47,809 o.a.f.k.o.s.ReloadableSslContext [INFO ] Creating keystore with type: pkcs12
> 2025-07-04 09:08:47,810 o.a.f.k.o.s.ReloadableSslContext [INFO ] Loading keystore from file: /certs/keystore.p12
> 2025-07-04 09:08:47,816 o.a.f.k.o.s.ReloadableSslContext [INFO ] Initializing key manager with keystore and password
> 2025-07-04 09:08:47,821 o.a.f.k.o.a.FlinkOperatorWebhook [INFO ] SSL context reloaded successfully
> 2025-07-04 09:08:56,977 o.a.f.c.GlobalConfiguration [INFO ] Using legacy YAML parser to load flink configuration file from /opt/flink/conf/flink-conf.yaml.
> 2025-07-04 09:08:56,982 o.a.f.c.GlobalConfiguration [INFO ] Loading configuration property: parallelism.default, 1
> 2025-07-04 09:08:56,982 o.a.f.c.GlobalConfiguration [INFO ] Loading configuration property: taskmanager.numberOfTaskSlots, 1
> 2025-07-04 09:08:56,982 o.a.f.c.GlobalConfiguration [INFO ] Loading configuration property: kubernetes.operator.default-configuration.flink-version.v1_18.env.java.opts.all, --add-exports=java.base/sun.net.util=ALL-UNNAMED --add-exports=java.rmi/sun.rmi.registry=ALL-UNNAMED --add-exports=java.security.jgss/sun.security.krb5=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.text=ALL-UNNAMED --add-opens=java.base/java.time=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.locks=ALL-UNNAMED
> 2025-07-04 09:08:56,982 o.a.f.c.GlobalConfiguration [INFO ] Loading configuration property: kubernetes.operator.reconcile.interval, 15 s
> 2025-07-04 09:08:56,982 o.a.f.c.GlobalConfiguration [INFO ] Loading configuration property: kubernetes.operator.default-configuration.flink-version.v1_19+.env.java.default-opts.all, --add-exports=java.base/sun.net.util=ALL-UNNAMED --add-exports=java.rmi/sun.rmi.registry=ALL-UNNAMED --add-exports=java.security.jgss/sun.security.krb5=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.text=ALL-UNNAMED --add-opens=java.base/java.time=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.locks=ALL-UNNAMED
> 2025-07-04 09:08:56,982 o.a.f.c.GlobalConfiguration [INFO ] Loading configuration property: kubernetes.operator.metrics.reporter.slf4j.interval, 5 MINUTE
> 2025-07-04 09:08:56,983 o.a.f.c.GlobalConfiguration [INFO ] Loading configuration property: kubernetes.operator.observer.progress-check.interval, 5 s
> 2025-07-04 09:08:56,983 o.a.f.c.GlobalConfiguration [INFO ] Loading configuration property: kubernetes.operator.health.probe.enabled, true
> 2025-07-04 09:08:56,983 o.a.f.c.GlobalConfiguration [INFO ] Loading configuration property: kubernetes.operator.health.probe.port, 8085
> 2025-07-04 09:08:56,983 o.a.f.c.GlobalConfiguration [INFO ] Loading configuration property: kubernetes.operator.metrics.reporter.slf4j.factory.class, org.apache.flink.metrics.slf4j.Slf4jReporterFactory
> 2025-07-04 09:08:56,984 o.a.f.k.o.c.FlinkConfigManager [INFO ] Default configuration did not change, nothing to do...
> {code}
>
> 11. Create a resource to test the webhook
> {code:java}
> ➜ flink-kubernetes-operator git:(main) ✗ k create -f examples/basic.yaml
> flinkdeployment.flink.apache.org/basic-example created
> {code}
>
> 12. Check the resource status
> {code:java}
> ➜ flink-kubernetes-operator git:(main) ✗ k get flinkdeployments.flink.apache.org
> NAME            JOB STATUS   LIFECYCLE STATE
> basic-example   RUNNING      STABLE
> ➜ flink-kubernetes-operator git:(main) ✗ k get po
> NAME                                         READY   STATUS    RESTARTS   AGE
> basic-example-6c7bff5c68-gmlh2               1/1     Running   0          25s
> basic-example-taskmanager-1-1                1/1     Running   0          14s
> flink-kubernetes-operator-7dc7858566-42g5z   2/2     Running   0          7m28s
> {code}
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
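
The forced rotation in step 8 of the validation can also be scripted rather than checked by eye. The following is a minimal sketch and not part of the original validation. The helper names `cert_revision`, `revision_increased` and `verify_rotation`, and the polling budget of 30 tries at 2-second intervals, are assumptions made for illustration. The Certificate name and the `cmctl renew` command are taken from the transcript. It assumes `kubectl` and `cmctl` are on the PATH and point at the cluster under test.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; the Certificate name is the one
# shown in the validation transcript.
CERT=flink-operator-serving-cert

# Print the Certificate's current .status.revision.
cert_revision() {
    kubectl get certificate "$CERT" -o jsonpath='{.status.revision}'
}

# Succeed only if the second revision number is greater than the first.
revision_increased() {
    [ "$2" -gt "$1" ]
}

# Trigger a renewal, then poll until cert-manager reports a higher revision.
verify_rotation() {
    before=$(cert_revision) || return 1
    cmctl renew "$CERT" || return 1
    tries=0
    # Assumption: 30 tries x 2 s is enough time for a self-signed issuer.
    while [ "$tries" -lt 30 ]; do
        after=$(cert_revision) || return 1
        if revision_increased "$before" "$after"; then
            echo "rotated: revision $before -> $after"
            return 0
        fi
        tries=$((tries + 1))
        sleep 2
    done
    echo "revision still $before after polling" >&2
    return 1
}
```

Sourcing the file only defines the functions; calling `verify_rotation` performs the renewal and returns non-zero if the revision does not advance. Polling `.status.revision` is used here because the `Ready` condition can stay `True` across a renewal, as the unchanged `lastTransitionTime` in the two YAML dumps suggests.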