Hi Dongwoo,

Thank you very much for your response. It has been very helpful to me.
Your email described how to configure the keytab and krb5.conf files so that Flink can write checkpoints to a Kerberos-secured HDFS. However, if the pod does not know the address of the HDFS namenode, the Hadoop configuration files (core-site.xml / hdfs-site.xml) also need to be loaded into the Flink environment so that Flink knows which namenode to write to. In Flink on YARN mode we can set the environment variable with "export HADOOP_CONF_DIR", keep the Hadoop configuration files in that directory, and Flink detects the namenode automatically. My main question is how to load these Hadoop configuration files in the Flink Kubernetes operator (one possible approach is sketched at the bottom of this mail, but I am not sure it is the recommended way). If you know how to do that, please let me know. I hope to receive your response via email. Thank you!

________________________________
From: Dongwoo Kim <dongwoo7....@gmail.com>
Sent: Wednesday, June 21, 2023 7:56:52 PM
To: 李 琳 <leili...@outlook.com>
Cc: user@flink.apache.org <user@flink.apache.org>
Subject: Re: How to set hdfs configuration in flink kubernetes operator?

Hi leilinee,

I'm not sure whether this is the best practice, but I would like to share our experience with configuring HDFS as checkpoint storage while using the Flink Kubernetes operator. There are two steps.

Step 1) Mount the krb5.conf and keytab files into the Flink Kubernetes operator pod

You have to create a ConfigMap and a Secret for krb5.conf and the keytab respectively, and apply the configs below to the Flink Kubernetes operator's values.yaml:

operatorVolumeMounts:
  create: true
  data:
    - mountPath: /opt/flink/krb5.conf
      name: krb5-conf
      subPath: krb5.conf
    - mountPath: /opt/flink/{keytab_file}
      name: custom-keytab
      subPath: {keytab_file}
operatorVolumes:
  create: true
  data:
    - configMap:
        name: krb5-configmap
      name: krb5-conf
    - name: custom-keytab
      secret:
        secretName: custom-keytab

Step 2) Configure the FlinkDeployment like below in your application

apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
spec:
  flinkConfiguration:
    state.checkpoint-storage: "filesystem"
    state.checkpoints.dir: "hdfs:{path_for_checkpoint}"
    security.kerberos.login.keytab: "/opt/flink/{keytab_file}"  # absolute path in the Flink k8s operator pod
    security.kerberos.login.principal: "{principal_name}"
    security.kerberos.relogin.period: "5m"
    security.kerberos.krb5-conf.path: "/opt/flink/krb5.conf"  # absolute path in the Flink k8s operator pod

I hope this could help your work.

Best regards,
dongwoo

On Wed, Jun 21, 2023 at 7:36 PM, 李 琳 <leili...@outlook.com> wrote:

Hi all,

Recently, I have been testing the Flink Kubernetes Operator. In the official example, the checkpoint/savepoint path is configured with a file system:

state.savepoints.dir: file:///flink-data/savepoints
state.checkpoints.dir: file:///flink-data/checkpoints
high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
high-availability.storageDir: file:///flink-data/ha

However, in our production environment we use HDFS to store checkpoint data. I'm wondering whether it's possible to store checkpoint data in HDFS with the Flink Kubernetes Operator as well. If so, could you please guide me on how to set up the HDFS configuration in the Flink Kubernetes Operator? I would greatly appreciate any assistance you can provide. Thank you!
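
P.S. Here is the untested sketch I mentioned above: publish core-site.xml / hdfs-site.xml through a ConfigMap (named hadoop-conf here purely as an illustration) and point HADOOP_CONF_DIR at the mount via the FlinkDeployment pod template, mirroring what "export HADOOP_CONF_DIR" does on YARN. The container name flink-main-container is the fixed name the operator expects for the main Flink container; everything else is an assumption on my side.

# Hypothetical ConfigMap holding the Hadoop client configuration, e.g. created with:
#   kubectl create configmap hadoop-conf --from-file=core-site.xml --from-file=hdfs-site.xml
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
spec:
  podTemplate:
    spec:
      containers:
        - name: flink-main-container   # fixed container name expected by the operator
          env:
            - name: HADOOP_CONF_DIR    # same mechanism as "export HADOOP_CONF_DIR" on YARN
              value: /opt/hadoop/conf
          volumeMounts:
            - name: hadoop-conf
              mountPath: /opt/hadoop/conf
      volumes:
        - name: hadoop-conf
          configMap:
            name: hadoop-conf          # hypothetical ConfigMap with core-site.xml / hdfs-site.xml

Alternatively, Flink's native Kubernetes mode has a kubernetes.hadoop.conf.config-map.name option that mounts an existing Hadoop-conf ConfigMap into the JobManager and TaskManager pods, but I have not verified which approach the operator recommends.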
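Also, for completeness, a sketch of how the ConfigMap and Secret referenced in Dongwoo's step 1 might be created. The names match those in the values.yaml snippet above; the realm and file contents are placeholders, not real values:

apiVersion: v1
kind: ConfigMap
metadata:
  name: krb5-configmap   # referenced by operatorVolumes above
data:
  krb5.conf: |
    [libdefaults]
      default_realm = EXAMPLE.COM   # placeholder realm; use your real krb5.conf contents
---
apiVersion: v1
kind: Secret
metadata:
  name: custom-keytab    # referenced by operatorVolumes above
type: Opaque
data:
  # base64-encoded keytab, e.g. produced by:
  #   kubectl create secret generic custom-keytab --from-file={keytab_file}
  {keytab_file}: "{base64_encoded_keytab}"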