Hi, I'm trying to use HDFS as the checkpoint and savepoint storage for Flink. I have an HDFS cluster running on Docker, and I have made the following changes to the Flink configuration:

    state.backend: filesystem
    state.savepoints.dir: hdfs://namenode:9000/user/root/savepoints
    state.checkpoints.dir: hdfs://namenode:9000/user/root/checkpoints

The docker-compose.yml file defines the services for the Hadoop cluster as well as the Flink jobmanager and taskmanager containers.

When I try to run the job, I get the following errors:

    org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not find a file system implementation for scheme 'hdfs'. The scheme is not directly supported by Flink and no Hadoop file system to support this scheme could be loaded.
    org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Hadoop is not in the classpath/dependencies.

How do I add Hadoop to the classpath? Note that Hadoop is not installed in the Flink jobmanager or taskmanager containers. Do I need to install it there, or does Flink have a built-in mechanism for this?

Version info:
    PyFlink on Flink 1.16
    Hadoop 3.2.1
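
In case it helps with diagnosis, here is a minimal sketch of a check that could be run inside the jobmanager and taskmanager containers. It assumes that Flink finds Hadoop either through the HADOOP_CLASSPATH environment variable or through jars placed in the Flink lib directory. The function name hadoop_visibility and the default path /opt/flink/lib are illustrative assumptions and may not match the actual image layout.

```python
import os
from pathlib import Path


def hadoop_visibility(env=None, flink_lib="/opt/flink/lib"):
    """Report whether Hadoop jars look reachable from this container.

    Assumptions (not confirmed for this deployment): Hadoop is exposed via
    the HADOOP_CLASSPATH variable or via jars in the Flink lib directory,
    and that directory is located at `flink_lib`.
    """
    env = os.environ if env is None else env

    # Non-empty entries of HADOOP_CLASSPATH; empty list if it is unset.
    raw = env.get("HADOOP_CLASSPATH", "")
    entries = [entry for entry in raw.split(os.pathsep) if entry]

    # Jars in the Flink lib directory whose file name mentions "hadoop".
    lib = Path(flink_lib)
    jars = sorted(p.name for p in lib.glob("*hadoop*.jar")) if lib.is_dir() else []

    return {"hadoop_classpath_entries": entries, "hadoop_jars_in_lib": jars}


if __name__ == "__main__":
    print(hadoop_visibility())
```

If both lists come back empty in a container, that would be consistent with the "Hadoop is not in the classpath/dependencies" error above, although it does not prove that this is the only cause.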