Nathanael England created FLINK-32939:
-----------------------------------------

    Summary: pyflink 1.17.0 has missing transitive dependency for pyopenssl
    Key: FLINK-32939
    URL: https://issues.apache.org/jira/browse/FLINK-32939
    Project: Flink
    Issue Type: Bug
    Environment: Ubuntu 20.04, Flink 1.17.0
    Reporter: Nathanael England

When running a PyFlink job recently, we hit an error importing from pyopenssl. Here is the traceback:

{code:bash}
E   Caused by: java.lang.RuntimeException: Failed to create stage bundle factory! Traceback (most recent call last):
E     File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
E       return _run_code(code, main_globals, None,
E     File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
E       exec(code, run_globals)
E     File "/home/buildbot/.cache/pants/named_caches/pex_root/venvs/s/361094c4/venv/lib/python3.8/site-packages/pyflink/fn_execution/beam/beam_boot.py", line 36, in <module>
E       from apache_beam.portability.api.org.apache.beam.model.fn_execution.v1.beam_fn_api_pb2 import \
E     File "/home/buildbot/.cache/pants/named_caches/pex_root/venvs/s/361094c4/venv/lib/python3.8/site-packages/apache_beam/__init__.py", line 93, in <module>
E       from apache_beam import io
E     File "/home/buildbot/.cache/pants/named_caches/pex_root/venvs/s/361094c4/venv/lib/python3.8/site-packages/apache_beam/io/__init__.py", line 27, in <module>
E       from apache_beam.io.mongodbio import *
E     File "/home/buildbot/.cache/pants/named_caches/pex_root/venvs/s/361094c4/venv/lib/python3.8/site-packages/apache_beam/io/mongodbio.py", line 93, in <module>
E       from bson import json_util
E     File "/home/buildbot/.cache/pants/named_caches/pex_root/venvs/s/361094c4/venv/lib/python3.8/site-packages/bson/json_util.py", line 130, in <module>
E       from pymongo.errors import ConfigurationError
E     File "/home/buildbot/.cache/pants/named_caches/pex_root/venvs/s/361094c4/venv/lib/python3.8/site-packages/pymongo/__init__.py", line 114, in <module>
E       from pymongo.collection import ReturnDocument
E     File "/home/buildbot/.cache/pants/named_caches/pex_root/venvs/s/361094c4/venv/lib/python3.8/site-packages/pymongo/collection.py", line 26, in <module>
E       from pymongo import common, helpers, message
E     File "/home/buildbot/.cache/pants/named_caches/pex_root/venvs/s/361094c4/venv/lib/python3.8/site-packages/pymongo/common.py", line 38, in <module>
E       from pymongo.ssl_support import validate_allow_invalid_certs, validate_cert_reqs
E     File "/home/buildbot/.cache/pants/named_caches/pex_root/venvs/s/361094c4/venv/lib/python3.8/site-packages/pymongo/ssl_support.py", line 27, in <module>
E       import pymongo.pyopenssl_context as _ssl
E     File "/home/buildbot/.cache/pants/named_caches/pex_root/venvs/s/361094c4/venv/lib/python3.8/site-packages/pymongo/pyopenssl_context.py", line 27, in <module>
E       from OpenSSL import SSL as _SSL
E     File "/usr/local/lib/python3.8/dist-packages/OpenSSL/__init__.py", line 8, in <module>
E       from OpenSSL import crypto, SSL
E     File "/usr/local/lib/python3.8/dist-packages/OpenSSL/crypto.py", line 1556, in <module>
E       class X509StoreFlags(object):
E     File "/usr/local/lib/python3.8/dist-packages/OpenSSL/crypto.py", line 1577, in X509StoreFlags
E       CB_ISSUER_CHECK = _lib.X509_V_FLAG_CB_ISSUER_CHECK
E   AttributeError: module 'lib' has no attribute 'X509_V_FLAG_CB_ISSUER_CHECK'
{code}

From this traceback, it appears that apache-flink depends on apache-beam, which depends on pymongo, which in turn tries to use pyopenssl. For pymongo to have pyopenssl available, users need to specify `pymongo[ocsp]` as their dependency instead of just `pymongo`. It looks like apache-beam is just specifying `pymongo` and then doing some horrible Python path mutilation to find some random installation on the system path: the failing `OpenSSL` modules are loaded from `/usr/local/lib/python3.8/dist-packages`, not from the venv under `pex_root`. The tool we are using (pantsbuild) modifies the Python path at the start, so it shouldn't have been possible to find this installation.

I believe this is an Apache Beam problem, but Jira will not let me make an issue there.
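
One way to confirm which installation is being picked up would be to run something like the sketch below with the same interpreter and environment as the job. It is only an illustration: the helper names (`locate_module`, `installed_version`, `is_inside_env`) are made up for this example, and it assumes `sys.prefix` points at the root of the venv. It uses `importlib.util.find_spec` so that `OpenSSL` is located without being imported, which avoids triggering the `AttributeError` above.

```python
import importlib.util
import sys
from importlib import metadata
from pathlib import Path
from typing import Optional


def locate_module(name: str) -> Optional[Path]:
    """Return the file a top-level module would load from, without importing it."""
    spec = importlib.util.find_spec(name)
    # Built-in, frozen, and namespace packages have no single file location.
    if spec is None or not spec.has_location or spec.origin is None:
        return None
    return Path(spec.origin).resolve()


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def is_inside_env(path: Path, env_root: Optional[Path] = None) -> bool:
    """Check whether a path lives under env_root (assumed default: sys.prefix)."""
    root = (env_root or Path(sys.prefix)).resolve()
    return root in path.resolve().parents


if __name__ == "__main__":
    # (import name, distribution name) pairs taken from the traceback, plus
    # cryptography, whose bindings pyOpenSSL wraps.
    candidates = [
        ("pymongo", "pymongo"),
        ("OpenSSL", "pyOpenSSL"),
        ("cryptography", "cryptography"),
    ]
    for module_name, dist_name in candidates:
        location = locate_module(module_name)
        version = installed_version(dist_name)
        if location is None:
            print(f"{module_name}: not importable from this environment")
            continue
        where = "inside" if is_inside_env(location) else "OUTSIDE"
        print(f"{module_name} ({dist_name} {version}): {location} [{where} {sys.prefix}]")
```

A module reported as OUTSIDE would suggest that a system-wide installation is leaking into the job's environment, as the `/usr/local/lib/python3.8/dist-packages` frames in the traceback indicate.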
Since this affects all Flink Python users, it seems appropriate to track it here, as whatever fix lands in Beam should be carried downstream into Flink.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)