Hi!

I'm new to Flink, and I have been trying to run a simple python flink
script that consumes messages from Kafka as well as the examples locally
with a few issues.

1. When I run the word count example using `./flink-1.17.0/bin/flink run
--python flink-1.17.0/examples/python/datastream/word_count.py`, I get the
following error:

```
Traceback (most recent call last):
  File "/Users/jill/flink-1.17.0/examples/python/datastream/word_count.py",
line 134, in <module>
    word_count(known_args.input, known_args.output)
  File "/Users/jill/flink-1.17.0/examples/python/datastream/word_count.py",
line 89, in word_count
    ds = ds.flat_map(split) \
  File
"/Users/jill/flink-1.17.0/opt/python/pyflink.zip/pyflink/datastream/data_stream.py",
line 354, in flat_map
  File
"/Users/jill/flink-1.17.0/opt/python/pyflink.zip/pyflink/datastream/data_stream.py",
line 654, in process
  File "<frozen zipimport>", line 259, in load_module
  File
"/Users/jill/flink-1.17.0/opt/python/pyflink.zip/pyflink/fn_execution/flink_fn_execution_pb2.py",
line 22, in <module>
ModuleNotFoundError: No module named 'google'
org.apache.flink.client.program.ProgramAbortException:
java.lang.RuntimeException: Python process exits with code: 1
at org.apache.flink.client.python.PythonDriver.main(PythonDriver.java:140)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
Method)
at
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:355)
at
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:222)
at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:105)
at
org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:851)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:245)
at
org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1095)
at
org.apache.flink.client.cli.CliFrontend.lambda$mainInternal$9(CliFrontend.java:1189)
at
org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
at
org.apache.flink.client.cli.CliFrontend.mainInternal(CliFrontend.java:1189)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1157)
Caused by: java.lang.RuntimeException: Python process exits with code: 1
at org.apache.flink.client.python.PythonDriver.main(PythonDriver.java:130)
... 14 more
```

This issue happens with and without `-pyclientexec venv/bin/python3`. When
running the example using `python word_count.py`, I get a similar error but
the `pyflink` module is not found. I installed pyflink using `python -m pip
install apache-flink` within a python venv (not a conda one) and downloaded
the corresponding binary. Looking through this mailing list's archive as
well as on Stack Overflow, I saw similar issues, and the fixes were usually
from installing flink incorrectly. For me, Flink and its dependencies
(including protobuf) are in
`.../Library/Python/3.9/lib/python/site-packages/...`. I'm pretty sure I'm
doing something silly, but I'm not quite sure what the fix is here.

2. When I run my Python script (consumes json-formatted messages from Kafka
to a datastream and prints) using `./flink-1.17.0/bin/flink run --python
kafka_consumer.py`, I see the same issue as the traceback above. When using
`python kafka_consumer.py`, the program hangs on `env.execute()`. I see no
running jobs on the dashboard. Is there a good way to go about debugging
this? Should I be waiting a while for it to start running?

Thanks in advance!

- Jill

Reply via email to