Hello everyone,

I'm looking to run a PyFlink application in a distributed fashion on
Kubernetes, and am currently facing issues. I've successfully gotten
a Scala Flink application to run using the manifests provided at [0].

I attempted to run the application by updating the jobmanager command args
from

 args: ["standalone-job", "--job-classname", "com.job.ClassName",
<optional arguments>, <job arguments>]

to

args: ["standalone-job", "--python", "my_python_app.py", <optional
arguments>, <job arguments>]
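
For context, here is roughly what that change looks like inside the
jobmanager container spec from [0]. This is a sketch, not a working config:
the image name is a hypothetical custom image assumed to have PyFlink
installed and the script baked in; everything else follows the upstream
manifests.

```yaml
# Sketch of the modified jobmanager container from the manifests at [0].
# "my-registry/flink-python:1.12" is a placeholder for a custom image that
# has PyFlink installed and my_python_app.py copied into the image.
containers:
  - name: jobmanager
    image: my-registry/flink-python:1.12
    args: ["standalone-job", "--python", "/opt/flink/usrlib/my_python_app.py"]
    ports:
      - containerPort: 6123
        name: rpc
      - containerPort: 8081
        name: webui
```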

But this didn't work. It resulted in the following error:

Caused by: java.lang.LinkageError: loader constraint violation: loader
org.apache.flink.util.ChildFirstClassLoader @2d8f2f3a wants to load class
org.apache.commons.cli.Options. A different class with the same name was
previously loaded by 'app'. (org.apache.commons.cli.Options is in unnamed
module of loader 'app'

I was able to get things to 'run' by setting args to:

args: ["python", "my_python_app.py", <optional arguments>, <job arguments>]


But I'm not sure whether the job was actually running in a distributed
fashion, or just locally inside the jobmanager pod.

1/ Is there a good way to check whether the taskmanager pods are actually
being utilized?
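
One way to sanity-check this is to hit the JobManager's REST API (e.g. after
`kubectl port-forward svc/flink-jobmanager 8081`, assuming the service name
from [0]) and see whether any TaskManagers have registered. Below is a small
sketch that parses the JSON returned by `GET /taskmanagers`; the field names
(`taskmanagers`, `slotsNumber`, `freeSlots`) are from Flink's REST API, and
the sample payload at the bottom is made up for illustration.

```python
import json

def summarize_taskmanagers(body: str) -> dict:
    """Summarize the JSON body from Flink's REST endpoint GET /taskmanagers.

    If "count" is 0, no TaskManagers registered with the JobManager and the
    job cannot be running in a distributed fashion.
    """
    data = json.loads(body)
    tms = data.get("taskmanagers", [])
    return {
        "count": len(tms),
        "total_slots": sum(tm.get("slotsNumber", 0) for tm in tms),
        "free_slots": sum(tm.get("freeSlots", 0) for tm in tms),
    }

if __name__ == "__main__":
    # In a real cluster you would fetch the body instead, e.g.:
    #   kubectl port-forward svc/flink-jobmanager 8081 &
    #   curl http://localhost:8081/taskmanagers
    sample = (
        '{"taskmanagers": ['
        '{"id": "tm-1", "slotsNumber": 2, "freeSlots": 0},'
        '{"id": "tm-2", "slotsNumber": 2, "freeSlots": 1}]}'
    )
    print(summarize_taskmanagers(sample))
```

If the count matches the number of taskmanager pods and free slots drop while
the job runs, the task pods are being used.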

2/ Are there any examples similar to [0] showing how to run PyFlink jobs on
Kubernetes?

Open to any suggestions you may have. Note: we'd prefer not to use the
native Kubernetes route outlined at [1], because we need to retain the
ability to customize certain aspects of the deployment (e.g. mounting SSDs
on some of the pods).

Thanks in advance!

[0]
https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/resource-providers/standalone/kubernetes.html#application-cluster-resource-definitions

[1]
https://ci.apache.org/projects/flink/flink-docs-stable/deployment/resource-providers/native_kubernetes.html#application-mode
