Hi, I'm new to PyFlink, and I couldn't run a basic example that shipped with Flink. This is the command I tried:
./bin/flink run -py examples/python/datastream/word_count.py Here below are the results I got with different setups: 1. On AWS EMR 6.8.0 (Flink 1.15.1): *Error: No module named 'google'*I tried with the Flink shipped with EMR, or the binary v1.15.1/v1.15.2 downloaded from Flink site. I got that same error message in all cases. Traceback (most recent call last): File "/usr/lib64/python3.7/runpy.py", line 193, in _run_module_as_main "__main__", mod_spec) File "/usr/lib64/python3.7/runpy.py", line 85, in _run_code exec(code, run_globals) File "/tmp/pyflink/622f19cc-6616-45ba-b7cb-20a1462c4c3f/a29eb7a4-fff1-4d98-9b38-56b25f97b70a/word_count.py", line 134, in <module> word_count(known_args.input, known_args.output) File "/tmp/pyflink/622f19cc-6616-45ba-b7cb-20a1462c4c3f/a29eb7a4-fff1-4d98-9b38-56b25f97b70a/word_count.py", line 89, in word_count ds = ds.flat_map(split) \ File "/usr/lib/flink/opt/python/pyflink.zip/pyflink/datastream/data_stream.py", line 333, in flat_map File "/usr/lib/flink/opt/python/pyflink.zip/pyflink/datastream/data_stream.py", line 557, in process File "/usr/lib/flink/opt/python/pyflink.zip/pyflink/fn_execution/flink_fn_execution_pb2.py", line 23, in <module> ModuleNotFoundError: No module named 'google' org.apache.flink.client.program.ProgramAbortException: java.lang.RuntimeException: Python process exits with code: 1 2. On my Mac, without a virtual environment (so `*-pyclientexec python3*` is included in the run command): got the same error as with EMR, but there's a stdout line from `*print()*` in the Python script File "/Users/lvh/dev/tmp/flink-1.15.2/opt/python/pyflink.zip/pyflink/datastream/data_stream.py", line 557, in process File "<frozen zipimport>", line 259, in load_module File "/Users/lvh/dev/tmp/flink-1.15.2/opt/python/pyflink.zip/pyflink/fn_execution/flink_fn_execution_pb2.py", line 23, in <module> ModuleNotFoundError: No module named 'google' Executing word_count example with default input data set. Use --input to specify file input. org.apache.flink.client.program.ProgramAbortException: java.lang.RuntimeException: Python process exits with code: 1 3. On my Mac, with a virtual environment and Python package `*apache-flink`* installed: Flink tried to connect to localhost:8081 (I don't know why), and failed with 'connection refused'. Caused by: org.apache.flink.util.concurrent.FutureUtils$RetryException: Could not complete the operation. Number of retries has been exhausted. at org.apache.flink.util.concurrent.FutureUtils.lambda$retryOperationWithDelay$9(FutureUtils.java:395) ... 21 more Caused by: java.util.concurrent.CompletionException: org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:8081 4. If I run that same example job using Python: `*python word_count.py*` then it runs well. I tried with both v1.15.2 and 1.15.1, Python 3.7 & 3.8, and got the same result. Could someone please help? Thanks.