Hi! I'm new to Flink, and I have been trying to run a simple python flink script that consumes messages from Kafka as well as the examples locally with a few issues.
1. When I run the word count example using `./flink-1.17.0/bin/flink run --python flink-1.17.0/examples/python/datastream/word_count.py`, I get the following error: ``` Traceback (most recent call last): File "/Users/jill/flink-1.17.0/examples/python/datastream/word_count.py", line 134, in <module> word_count(known_args.input, known_args.output) File "/Users/jill/flink-1.17.0/examples/python/datastream/word_count.py", line 89, in word_count ds = ds.flat_map(split) \ File "/Users/jill/flink-1.17.0/opt/python/pyflink.zip/pyflink/datastream/data_stream.py", line 354, in flat_map File "/Users/jill/flink-1.17.0/opt/python/pyflink.zip/pyflink/datastream/data_stream.py", line 654, in process File "<frozen zipimport>", line 259, in load_module File "/Users/jill/flink-1.17.0/opt/python/pyflink.zip/pyflink/fn_execution/flink_fn_execution_pb2.py", line 22, in <module> ModuleNotFoundError: No module named 'google' org.apache.flink.client.program.ProgramAbortException: java.lang.RuntimeException: Python process exits with code: 1 at org.apache.flink.client.python.PythonDriver.main(PythonDriver.java:140) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:355) at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:222) at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:105) at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:851) at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:245) at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1095) at org.apache.flink.client.cli.CliFrontend.lambda$mainInternal$9(CliFrontend.java:1189) at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28) at org.apache.flink.client.cli.CliFrontend.mainInternal(CliFrontend.java:1189) at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1157) Caused by: java.lang.RuntimeException: Python process exits with code: 1 at org.apache.flink.client.python.PythonDriver.main(PythonDriver.java:130) ... 14 more ``` This issue happens with and without `-pyclientexec venv/bin/python3`. When running the example using `python word_count.py`, I get a similar error but the `pyflink` module is not found. I installed pyflink using `python -m pip install apache-flink` within a python venv (not a conda one) and downloaded the corresponding binary. Looking through this mailing list's archive as well as on Stack Overflow, I saw similar issues, and the fixes were usually from installing flink incorrectly. For me, Flink and its dependencies (including protobuf) are in `.../Library/Python/3.9/lib/python/site-packages/...`. I'm pretty sure I'm doing something silly, but I'm not quite sure what the fix is here. 2. When I run my Python script (consumes json-formatted messages from Kafka to a datastream and prints) using `./flink-1.17.0/bin/flink run --python kafka_consumer.py`, I see the same issue as the traceback above. When using `python kafka_consumer.py`, the program hangs on `env.execute()`. I see no running jobs on the dashboard. Is there a good way to go about debugging this? Should I be waiting a while for it to start running? Thanks in advance! - Jill