[ 
https://issues.apache.org/jira/browse/FLINK-38047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-38047:
-----------------------------------
    Labels: dependency pull-request-available  (was: dependency)

> Bump cert-manager in the Kubernetes Operator
> --------------------------------------------
>
>                 Key: FLINK-38047
>                 URL: https://issues.apache.org/jira/browse/FLINK-38047
>             Project: Flink
>          Issue Type: Technical Debt
>          Components: Kubernetes Operator
>            Reporter: Kumar Mallikarjuna
>            Priority: Major
>              Labels: dependency, pull-request-available
>
> Flink Kubernetes Operator currently uses cert-manager:{_}v1.8.2{_} in the 
> [CI|https://github.com/apache/flink-kubernetes-operator/blob/main/e2e-tests/cert-manager.yaml]
>  and recommends the same in 
> [docs|https://github.com/apache/flink-kubernetes-operator/blob/8812c78cd6a2c0ad1b672ca08a8b880bd890ae8b/docs/content/docs/try-flink-kubernetes-operator/quick-start.md?plain=1#L69-L72].
>  The latest stable release _v1.18.2_ is ten minor versions ahead. We should 
> bump the recommendations and tests to the latest release.
>  
> *Validation for _cert-manager:v1.18.2_ with 
> _flink-kubernetes-operator:v1.12.0_*
> 1. Start a kind cluster
> {code:java}
> ➜  flink-kubernetes-operator git:(main) ✗ kind create cluster
> Creating cluster "kind" ...
>  ✓ Ensuring node image (kindest/node:v1.32.2) 🖼
>  ✓ Preparing nodes 📦
>  ✓ Writing configuration 📜
>  ✓ Starting control-plane 🕹️
>  ✓ Installing CNI 🔌
>  ✓ Installing StorageClass 💾
> Set kubectl context to "kind-kind"
> You can now use your cluster with:
>
> kubectl cluster-info --context kind-kind
>
> Have a nice day! 👋
> {code}
>  
> 2. Install cert-manager v1.18.2
> {code:java}
> ➜  flink-kubernetes-operator git:(main) ✗ kubectl create -f https://github.com/cert-manager/cert-manager/releases/download/v1.18.2/cert-manager.yaml
> namespace/cert-manager created
> customresourcedefinition.apiextensions.k8s.io/certificaterequests.cert-manager.io created
> customresourcedefinition.apiextensions.k8s.io/certificates.cert-manager.io created
> customresourcedefinition.apiextensions.k8s.io/challenges.acme.cert-manager.io created
> customresourcedefinition.apiextensions.k8s.io/clusterissuers.cert-manager.io created
> customresourcedefinition.apiextensions.k8s.io/issuers.cert-manager.io created
> customresourcedefinition.apiextensions.k8s.io/orders.acme.cert-manager.io created
> serviceaccount/cert-manager-cainjector created
> serviceaccount/cert-manager created
> serviceaccount/cert-manager-webhook created
> clusterrole.rbac.authorization.k8s.io/cert-manager-cainjector created
> clusterrole.rbac.authorization.k8s.io/cert-manager-controller-issuers created
> clusterrole.rbac.authorization.k8s.io/cert-manager-controller-clusterissuers created
> clusterrole.rbac.authorization.k8s.io/cert-manager-controller-certificates created
> clusterrole.rbac.authorization.k8s.io/cert-manager-controller-orders created
> clusterrole.rbac.authorization.k8s.io/cert-manager-controller-challenges created
> clusterrole.rbac.authorization.k8s.io/cert-manager-controller-ingress-shim created
> clusterrole.rbac.authorization.k8s.io/cert-manager-cluster-view created
> clusterrole.rbac.authorization.k8s.io/cert-manager-view created
> clusterrole.rbac.authorization.k8s.io/cert-manager-edit created
> clusterrole.rbac.authorization.k8s.io/cert-manager-controller-approve:cert-manager-io created
> clusterrole.rbac.authorization.k8s.io/cert-manager-controller-certificatesigningrequests created
> clusterrole.rbac.authorization.k8s.io/cert-manager-webhook:subjectaccessreviews created
> clusterrolebinding.rbac.authorization.k8s.io/cert-manager-cainjector created
> clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-issuers created
> clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-clusterissuers created
> clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-certificates created
> clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-orders created
> clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-challenges created
> clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-ingress-shim created
> clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-approve:cert-manager-io created
> clusterrolebinding.rbac.authorization.k8s.io/cert-manager-controller-certificatesigningrequests created
> clusterrolebinding.rbac.authorization.k8s.io/cert-manager-webhook:subjectaccessreviews created
> role.rbac.authorization.k8s.io/cert-manager-cainjector:leaderelection created
> role.rbac.authorization.k8s.io/cert-manager:leaderelection created
> role.rbac.authorization.k8s.io/cert-manager-tokenrequest created
> role.rbac.authorization.k8s.io/cert-manager-webhook:dynamic-serving created
> rolebinding.rbac.authorization.k8s.io/cert-manager-cainjector:leaderelection created
> rolebinding.rbac.authorization.k8s.io/cert-manager:leaderelection created
> rolebinding.rbac.authorization.k8s.io/cert-manager-cert-manager-tokenrequest created
> rolebinding.rbac.authorization.k8s.io/cert-manager-webhook:dynamic-serving created
> service/cert-manager-cainjector created
> service/cert-manager created
> service/cert-manager-webhook created
> deployment.apps/cert-manager-cainjector created
> deployment.apps/cert-manager created
> deployment.apps/cert-manager-webhook created
> mutatingwebhookconfiguration.admissionregistration.k8s.io/cert-manager-webhook created
> validatingwebhookconfiguration.admissionregistration.k8s.io/cert-manager-webhook created
> {code}
>  
> 3. Wait for cert-manager to be ready
> {code:java}
> ➜  flink-kubernetes-operator git:(main) ✗ k -n cert-manager get po
> NAME                                       READY   STATUS    RESTARTS   AGE
> cert-manager-69f748766f-28s8d              1/1     Running   0          44s
> cert-manager-cainjector-7cf6557c49-gdfd7   1/1     Running   0          44s
> cert-manager-webhook-58f4cff74d-kz4pc      1/1     Running   0          44s 
> {code}
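
The readiness check in step 3 was done by eye. A minimal sketch of automating it, assuming the default `kubectl get po` table layout (NAME, READY, STATUS, ...); the helper name `all_pods_ready` is hypothetical and not part of the operator repository:

```shell
# Succeeds only if every pod row on stdin is fully ready (n/n) and Running.
# Fails on an empty listing so "no pods yet" is not mistaken for "ready".
all_pods_ready() {
  awk 'NR > 1 {
         seen = 1
         split($2, r, "/")
         if (r[1] != r[2] || $3 != "Running") bad = 1
       }
       END { exit ((seen && !bad) ? 0 : 1) }'
}

# Hypothetical polling loop against a live cluster:
#   until kubectl -n cert-manager get po | all_pods_ready; do sleep 2; done
# kubectl also has a built-in alternative:
#   kubectl -n cert-manager wait --for=condition=Available deployment --all --timeout=120s
```

The built-in `kubectl wait` form is usually preferable in CI; the parsing helper is only useful when a saved listing has to be checked after the fact.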
>  
> 4. Install flink-kubernetes-operator
> {code:java}
> ➜  flink-kubernetes-operator git:(main) ✗ helm install flink-kubernetes-operator flink-operator-repo/flink-kubernetes-operator
> W0704 14:33:26.593488   51760 warnings.go:70] spec.privateKey.rotationPolicy: In cert-manager >= v1.18.0, the default value changed from `Never` to `Always`.
> NAME: flink-kubernetes-operator
> LAST DEPLOYED: Fri Jul  4 14:33:25 2025
> NAMESPACE: default
> STATUS: deployed
> REVISION: 1
> TEST SUITE: None{code}
>  
> *Note:* The warning about _spec.privateKey.rotationPolicy_ is expected and 
> can be ignored since it does not affect the functionality of the 
> operator/webhook.
>  
> 5. Verify the operator/webhook are running
> {code:java}
> ➜  flink-kubernetes-operator git:(main) ✗ k get po
> NAME                                         READY   STATUS    RESTARTS   AGE
> flink-kubernetes-operator-7dc7858566-42g5z   2/2     Running   0          112s
> {code}
>  
> 6. Test with a sample FlinkDeployment
> {code:java}
> ➜  flink-kubernetes-operator git:(main) ✗ k create -f examples/basic.yaml
> flinkdeployment.flink.apache.org/basic-example created
>  
> ➜  flink-kubernetes-operator git:(main) ✗ k get flinkdeployments.flink.apache.org
> NAME            JOB STATUS   LIFECYCLE STATE
> basic-example   RUNNING      STABLE
> ➜  flink-kubernetes-operator git:(main) ✗ k get po
> NAME                                         READY   STATUS    RESTARTS   AGE
> basic-example-6c7bff5c68-w669x               1/1     Running   0          70s
> basic-example-taskmanager-1-1                1/1     Running   0          23s
> flink-kubernetes-operator-7dc7858566-42g5z   2/2     Running   0          3m27s
> {code}
>  
> 7. Clean up the FlinkDeployment
> {code:java}
> ➜  flink-kubernetes-operator git:(main) ✗ k delete flinkdeployments.flink.apache.org basic-example
> flinkdeployment.flink.apache.org "basic-example" deleted
> {code}
>  
> 8. Force rotate the certificate
> {code:java}
> ➜  flink-kubernetes-operator git:(main) ✗ k get certificate
> NAME                          READY   SECRET                AGE
> flink-operator-serving-cert   True    webhook-server-cert   4m48s
> ➜  flink-kubernetes-operator git:(main) ✗ k get certificate flink-operator-serving-cert -oyaml
> apiVersion: cert-manager.io/v1
> kind: Certificate
> metadata:
>   annotations:
>     meta.helm.sh/release-name: flink-kubernetes-operator
>     meta.helm.sh/release-namespace: default
>   creationTimestamp: "2025-07-04T09:03:26Z"
>   generation: 1
>   labels:
>     app.kubernetes.io/managed-by: Helm
>   name: flink-operator-serving-cert
>   namespace: default
>   resourceVersion: "997"
>   uid: b0e1935c-eab8-4b61-ad9f-7bb0bf166c07
> spec:
>   commonName: FlinkDeployment Validator
>   dnsNames:
>   - flink-operator-webhook-service.default.svc
>   - flink-operator-webhook-service.default.svc.cluster.local
>   issuerRef:
>     kind: Issuer
>     name: flink-operator-selfsigned-issuer
>   keystores:
>     pkcs12:
>       create: true
>       passwordSecretRef:
>         key: password
>         name: flink-operator-webhook-secret
>   secretName: webhook-server-cert
> status:
>   conditions:
>   - lastTransitionTime: "2025-07-04T09:03:26Z"
>     message: Certificate is up to date and has not expired
>     observedGeneration: 1
>     reason: Ready
>     status: "True"
>     type: Ready
>   notAfter: "2025-10-02T09:03:26Z"
>   notBefore: "2025-07-04T09:03:26Z"
>   renewalTime: "2025-09-02T09:03:26Z"
>   revision: 1
> ➜  flink-kubernetes-operator git:(main) ✗ cmctl renew flink-operator-serving-cert
> Manually triggered issuance of Certificate default/flink-operator-serving-cert
> ➜  flink-kubernetes-operator git:(main) ✗ k get certificate flink-operator-serving-cert -oyaml
> apiVersion: cert-manager.io/v1
> kind: Certificate
> metadata:
>   annotations:
>     meta.helm.sh/release-name: flink-kubernetes-operator
>     meta.helm.sh/release-namespace: default
>   creationTimestamp: "2025-07-04T09:03:26Z"
>   generation: 1
>   labels:
>     app.kubernetes.io/managed-by: Helm
>   name: flink-operator-serving-cert
>   namespace: default
>   resourceVersion: "1591"
>   uid: b0e1935c-eab8-4b61-ad9f-7bb0bf166c07
> spec:
>   commonName: FlinkDeployment Validator
>   dnsNames:
>   - flink-operator-webhook-service.default.svc
>   - flink-operator-webhook-service.default.svc.cluster.local
>   issuerRef:
>     kind: Issuer
>     name: flink-operator-selfsigned-issuer
>   keystores:
>     pkcs12:
>       create: true
>       passwordSecretRef:
>         key: password
>         name: flink-operator-webhook-secret
>   secretName: webhook-server-cert
> status:
>   conditions:
>   - lastTransitionTime: "2025-07-04T09:03:26Z"
>     message: Certificate is up to date and has not expired
>     observedGeneration: 1
>     reason: Ready
>     status: "True"
>     type: Ready
>   notAfter: "2025-10-02T09:08:37Z"
>   notBefore: "2025-07-04T09:08:37Z"
>   renewalTime: "2025-09-02T09:08:37Z"
>   revision: 2 {code}
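
In the two dumps above, the renewal shows up as `status.revision` going from 1 to 2 and `notBefore` moving from 09:03:26Z to 09:08:37Z. A rough sketch of checking that without reading the YAML by hand, assuming the dumps were saved to files; the helper names and file names are hypothetical:

```shell
# Print status.revision from `kubectl get certificate -oyaml` output on stdin.
# Assumes "revision:" appears only once, under status, as in the dumps above.
cert_revision() {
  awk '$1 == "revision:" { print $2; exit }'
}

# Succeeds if the revision increased between two saved YAML dumps.
cert_was_reissued() {
  before=$(cert_revision < "$1")
  after=$(cert_revision < "$2")
  [ -n "$before" ] && [ -n "$after" ] && [ "$after" -gt "$before" ]
}

# Hypothetical usage:
#   kubectl get certificate flink-operator-serving-cert -oyaml > before.yaml
#   cmctl renew flink-operator-serving-cert
#   kubectl get certificate flink-operator-serving-cert -oyaml > after.yaml
#   cert_was_reissued before.yaml after.yaml && echo "certificate re-issued"
# Against a live cluster the field can also be read directly:
#   kubectl get certificate flink-operator-serving-cert -o jsonpath='{.status.revision}'
```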
>  
> 9. Verify the operator/webhook are still running
> {code:java}
> ➜  flink-kubernetes-operator git:(main) ✗ k get po
> NAME                                         READY   STATUS    RESTARTS   AGE
> flink-kubernetes-operator-7dc7858566-42g5z   2/2     Running   0          5m50s
> {code}
>  
> 10. Check the webhook logs and verify that the certificate was reloaded
> {code:java}
> ➜  flink-kubernetes-operator git:(main) ✗ k logs flink-kubernetes-operator-7dc7858566-42g5z -c flink-webhook | tail -20
> 2025-07-04 09:03:57,113 o.a.f.k.o.f.FileSystemWatchService [INFO ] Starting watching path: /certs
> 2025-07-04 09:03:57,117 o.a.f.k.o.f.FileSystemWatchService [INFO ] Path is resolved to real path: /certs
> 2025-07-04 09:03:57,186 o.a.f.k.o.a.FlinkOperatorWebhook [INFO ] Webhook listening at 0:0:0:0:0:0:0:0:9443
> 2025-07-04 09:08:47,807 o.a.f.k.o.a.FlinkOperatorWebhook [INFO ] Reloading SSL context because of certificate change
> 2025-07-04 09:08:47,809 o.a.f.k.o.s.ReloadableSslContext [INFO ] Creating keystore with type: pkcs12
> 2025-07-04 09:08:47,810 o.a.f.k.o.s.ReloadableSslContext [INFO ] Loading keystore from file: /certs/keystore.p12
> 2025-07-04 09:08:47,816 o.a.f.k.o.s.ReloadableSslContext [INFO ] Initializing key manager with keystore and password
> 2025-07-04 09:08:47,821 o.a.f.k.o.a.FlinkOperatorWebhook [INFO ] SSL context reloaded successfully
> 2025-07-04 09:08:56,977 o.a.f.c.GlobalConfiguration    [INFO ] Using legacy YAML parser to load flink configuration file from /opt/flink/conf/flink-conf.yaml.
> 2025-07-04 09:08:56,982 o.a.f.c.GlobalConfiguration    [INFO ] Loading configuration property: parallelism.default, 1
> 2025-07-04 09:08:56,982 o.a.f.c.GlobalConfiguration    [INFO ] Loading configuration property: taskmanager.numberOfTaskSlots, 1
> 2025-07-04 09:08:56,982 o.a.f.c.GlobalConfiguration    [INFO ] Loading configuration property: kubernetes.operator.default-configuration.flink-version.v1_18.env.java.opts.all, --add-exports=java.base/sun.net.util=ALL-UNNAMED --add-exports=java.rmi/sun.rmi.registry=ALL-UNNAMED --add-exports=java.security.jgss/sun.security.krb5=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.text=ALL-UNNAMED --add-opens=java.base/java.time=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.locks=ALL-UNNAMED
> 2025-07-04 09:08:56,982 o.a.f.c.GlobalConfiguration    [INFO ] Loading configuration property: kubernetes.operator.reconcile.interval, 15 s
> 2025-07-04 09:08:56,982 o.a.f.c.GlobalConfiguration    [INFO ] Loading configuration property: kubernetes.operator.default-configuration.flink-version.v1_19+.env.java.default-opts.all, --add-exports=java.base/sun.net.util=ALL-UNNAMED --add-exports=java.rmi/sun.rmi.registry=ALL-UNNAMED --add-exports=java.security.jgss/sun.security.krb5=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.text=ALL-UNNAMED --add-opens=java.base/java.time=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.locks=ALL-UNNAMED
> 2025-07-04 09:08:56,982 o.a.f.c.GlobalConfiguration    [INFO ] Loading configuration property: kubernetes.operator.metrics.reporter.slf4j.interval, 5 MINUTE
> 2025-07-04 09:08:56,983 o.a.f.c.GlobalConfiguration    [INFO ] Loading configuration property: kubernetes.operator.observer.progress-check.interval, 5 s
> 2025-07-04 09:08:56,983 o.a.f.c.GlobalConfiguration    [INFO ] Loading configuration property: kubernetes.operator.health.probe.enabled, true
> 2025-07-04 09:08:56,983 o.a.f.c.GlobalConfiguration    [INFO ] Loading configuration property: kubernetes.operator.health.probe.port, 8085
> 2025-07-04 09:08:56,983 o.a.f.c.GlobalConfiguration    [INFO ] Loading configuration property: kubernetes.operator.metrics.reporter.slf4j.factory.class, org.apache.flink.metrics.slf4j.Slf4jReporterFactory
> 2025-07-04 09:08:56,984 o.a.f.k.o.c.FlinkConfigManager [INFO ] Default configuration did not change, nothing to do... {code}
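
The log lines that matter in step 10 are the `FlinkOperatorWebhook` messages at 09:08:47, roughly ten seconds after the renewed certificate's `notBefore` of 09:08:37Z. A small sketch for picking them out of the rest of the output, assuming each log record is on a single line as the webhook emits it; the helper names and the `$POD` variable are hypothetical:

```shell
# Succeeds if the webhook log on stdin reports a completed SSL context reload.
ssl_context_reloaded() {
  grep -q 'SSL context reloaded successfully'
}

# Print the date and time of every completed reload found on stdin.
reload_times() {
  awk '/SSL context reloaded successfully/ { print $1, $2 }'
}

# Hypothetical usage against a live cluster, with POD set to the operator pod name:
#   kubectl logs "$POD" -c flink-webhook | reload_times
```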
>  
> 11. Create a resource to test the webhook
> {code:java}
> ➜  flink-kubernetes-operator git:(main) ✗ k create -f examples/basic.yaml
> flinkdeployment.flink.apache.org/basic-example created {code}
>  
> 12. Check the resource status
> {code:java}
> ➜  flink-kubernetes-operator git:(main) ✗ k get flinkdeployments.flink.apache.org
> NAME            JOB STATUS   LIFECYCLE STATE
> basic-example   RUNNING      STABLE
> ➜  flink-kubernetes-operator git:(main) ✗ k get po
> NAME                                         READY   STATUS    RESTARTS   AGE
> basic-example-6c7bff5c68-gmlh2               1/1     Running   0          25s
> basic-example-taskmanager-1-1                1/1     Running   0          14s
> flink-kubernetes-operator-7dc7858566-42g5z   2/2     Running   0          7m28s
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
