Hi,

I'm new to PyFlink, and I couldn't run a basic example that shipped with
Flink.
This is the command I tried:

./bin/flink run -py examples/python/datastream/word_count.py

Here below are the results I got with different setups:

1. On AWS EMR 6.8.0 (Flink 1.15.1):
*Error: No module named 'google'*I tried with the Flink shipped with EMR,
or the binary v1.15.1/v1.15.2 downloaded from Flink site. I got that same
error message in all cases.

Traceback (most recent call last):

  File "/usr/lib64/python3.7/runpy.py", line 193, in _run_module_as_main

    "__main__", mod_spec)

  File "/usr/lib64/python3.7/runpy.py", line 85, in _run_code

    exec(code, run_globals)

  File
"/tmp/pyflink/622f19cc-6616-45ba-b7cb-20a1462c4c3f/a29eb7a4-fff1-4d98-9b38-56b25f97b70a/word_count.py",
line 134, in <module>

    word_count(known_args.input, known_args.output)

  File
"/tmp/pyflink/622f19cc-6616-45ba-b7cb-20a1462c4c3f/a29eb7a4-fff1-4d98-9b38-56b25f97b70a/word_count.py",
line 89, in word_count

    ds = ds.flat_map(split) \

  File
"/usr/lib/flink/opt/python/pyflink.zip/pyflink/datastream/data_stream.py",
line 333, in flat_map

  File
"/usr/lib/flink/opt/python/pyflink.zip/pyflink/datastream/data_stream.py",
line 557, in process

  File
"/usr/lib/flink/opt/python/pyflink.zip/pyflink/fn_execution/flink_fn_execution_pb2.py",
line 23, in <module>

ModuleNotFoundError: No module named 'google'

org.apache.flink.client.program.ProgramAbortException:
java.lang.RuntimeException: Python process exits with code: 1

2. On my Mac, without a virtual environment (so `*-pyclientexec python3*`
is included in the run command): got the same error as with EMR, but
there's a stdout line from `*print()*` in the Python script

  File
"/Users/lvh/dev/tmp/flink-1.15.2/opt/python/pyflink.zip/pyflink/datastream/data_stream.py",
line 557, in process

  File "<frozen zipimport>", line 259, in load_module

  File
"/Users/lvh/dev/tmp/flink-1.15.2/opt/python/pyflink.zip/pyflink/fn_execution/flink_fn_execution_pb2.py",
line 23, in <module>

ModuleNotFoundError: No module named 'google'

Executing word_count example with default input data set.

Use --input to specify file input.

org.apache.flink.client.program.ProgramAbortException:
java.lang.RuntimeException: Python process exits with code: 1

3. On my Mac, with a virtual environment and Python package
`*apache-flink`* installed:
Flink tried to connect to localhost:8081 (I don't know why), and failed
with 'connection refused'.

Caused by: org.apache.flink.util.concurrent.FutureUtils$RetryException:
Could not complete the operation. Number of retries has been exhausted.

at
org.apache.flink.util.concurrent.FutureUtils.lambda$retryOperationWithDelay$9(FutureUtils.java:395)

... 21 more

Caused by: java.util.concurrent.CompletionException:
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel$AnnotatedConnectException:
Connection refused: localhost/127.0.0.1:8081

 4. If I run that same example job using Python: `*python word_count.py*`
then it runs well.

I tried with both v1.15.2 and 1.15.1, Python 3.7 & 3.8, and got the same
result.

Could someone please help?

Thanks.

Reply via email to