Hi Jill, I suspect that the PyFlink isn't installed in the Python environment which is used to run the example. Could you share the complete command you used to execute the example: `./flink-1.17.0/bin/flink run -pyclientexec venv/bin/python3 --python flink-1.17.0/examples/python/ datastream/word_count.py`. I think this is in-complete.
Regards, Dian On Fri, May 12, 2023 at 2:36 AM Jill Cardamon <jill.carda...@gmail.com> wrote: > Hi! > > I'm new to Flink, and I have been trying to run a simple python flink > script that consumes messages from Kafka as well as the examples locally > with a few issues. > > 1. When I run the word count example using `./flink-1.17.0/bin/flink run > --python flink-1.17.0/examples/python/datastream/word_count.py`, I get the > following error: > > ``` > Traceback (most recent call last): > File > "/Users/jill/flink-1.17.0/examples/python/datastream/word_count.py", line > 134, in <module> > word_count(known_args.input, known_args.output) > File > "/Users/jill/flink-1.17.0/examples/python/datastream/word_count.py", line > 89, in word_count > ds = ds.flat_map(split) \ > File > "/Users/jill/flink-1.17.0/opt/python/pyflink.zip/pyflink/datastream/data_stream.py", > line 354, in flat_map > File > "/Users/jill/flink-1.17.0/opt/python/pyflink.zip/pyflink/datastream/data_stream.py", > line 654, in process > File "<frozen zipimport>", line 259, in load_module > File > "/Users/jill/flink-1.17.0/opt/python/pyflink.zip/pyflink/fn_execution/flink_fn_execution_pb2.py", > line 22, in <module> > ModuleNotFoundError: No module named 'google' > org.apache.flink.client.program.ProgramAbortException: > java.lang.RuntimeException: Python process exits with code: 1 > at org.apache.flink.client.python.PythonDriver.main(PythonDriver.java:140) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:355) > at > org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:222) > at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:105) > at > org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:851) > at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:245) > at > org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1095) > at > org.apache.flink.client.cli.CliFrontend.lambda$mainInternal$9(CliFrontend.java:1189) > at > org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28) > at > org.apache.flink.client.cli.CliFrontend.mainInternal(CliFrontend.java:1189) > at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1157) > Caused by: java.lang.RuntimeException: Python process exits with code: 1 > at org.apache.flink.client.python.PythonDriver.main(PythonDriver.java:130) > ... 14 more > ``` > > This issue happens with and without `-pyclientexec venv/bin/python3`. When > running the example using `python word_count.py`, I get a similar error but > the `pyflink` module is not found. I installed pyflink using `python -m pip > install apache-flink` within a python venv (not a conda one) and downloaded > the corresponding binary. Looking through this mailing list's archive as > well as on Stack Overflow, I saw similar issues, and the fixes were usually > from installing flink incorrectly. For me, Flink and its dependencies > (including protobuf) are in > `.../Library/Python/3.9/lib/python/site-packages/...`. I'm pretty sure I'm > doing something silly, but I'm not quite sure what the fix is here. > > 2. When I run my Python script (consumes json-formatted messages from > Kafka to a datastream and prints) using `./flink-1.17.0/bin/flink run > --python kafka_consumer.py`, I see the same issue as the traceback above. > When using `python kafka_consumer.py`, the program hangs on > `env.execute()`. I see no running jobs on the dashboard. Is there a good > way to go about debugging this? Should I be waiting a while for it to start > running? > > Thanks in advance! > > - Jill > > > >