Hello everyone,

I'm looking to run a PyFlink application in a distributed fashion on Kubernetes, and am currently facing issues. I've successfully gotten a Scala Flink application to run using the manifests provided at [0].
I attempted to run the application by updating the jobmanager command args from

  args: ["standalone-job", "--job-classname", "com.job.ClassName", <optional arguments>, <job arguments>]

to

  args: ["standalone-job", "--python", "my_python_app.py", <optional arguments>, <job arguments>]

but this didn't work. It resulted in the following error:

  Caused by: java.lang.LinkageError: loader constraint violation: loader org.apache.flink.util.ChildFirstClassLoader @2d8f2f3a wants to load class org.apache.commons.cli.Options. A different class with the same name was previously loaded by 'app'. (org.apache.commons.cli.Options is in unnamed module of loader 'app'

I was able to get things to 'run' by setting args to:

  args: ["python", "my_python_app.py", <optional arguments>, <job arguments>]

but I'm not sure whether things were actually running in a distributed fashion.

1/ Is there a good way to check whether the task pods were being correctly utilized?
2/ Are there any examples similar to [0] showing how to run PyFlink jobs on Kubernetes?

Open to any suggestions you may have.

Note: we'd prefer not to use the native Kubernetes route outlined at [1], because we need to maintain the ability to customize certain aspects of the deployment (e.g. mounting SSDs to some of the pods).

Thanks in advance!

[0] https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/resource-providers/standalone/kubernetes.html#application-cluster-resource-definitions
[1] https://ci.apache.org/projects/flink/flink-docs-stable/deployment/resource-providers/native_kubernetes.html#application-mode
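P.S. For context on 1/, this is roughly how I was planning to check slot usage myself, by querying the JobManager's REST API (the `/taskmanagers` endpoint). The service name `flink-jobmanager` and port 8081 are placeholders from our manifests, and the summarizing helper is my own sketch, not anything from Flink:

```python
import json
from urllib.request import urlopen


def summarize_taskmanagers(payload):
    """Summarize slot usage from the /taskmanagers REST response.

    Assumes the documented response shape:
    {"taskmanagers": [{"id": ..., "slotsNumber": ..., "freeSlots": ...}, ...]}
    """
    tms = payload.get("taskmanagers", [])
    total = sum(tm.get("slotsNumber", 0) for tm in tms)
    free = sum(tm.get("freeSlots", 0) for tm in tms)
    return {
        "taskmanagers": len(tms),
        "total_slots": total,
        "used_slots": total - free,
    }


def check_cluster(base_url):
    """Fetch and summarize TaskManager registration/slot usage.

    If "taskmanagers" is 0 or "used_slots" stays 0 while a job runs,
    the task pods are likely not being used for the job.
    """
    with urlopen(base_url + "/taskmanagers") as resp:
        return summarize_taskmanagers(json.load(resp))


# Example usage (service name/port are placeholders for our deployment):
# print(check_cluster("http://flink-jobmanager:8081"))
```

If no TaskManagers show up, or all slots stay free while the job is "running", that would suggest everything is executing inside the jobmanager pod only.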