Biao Geng created FLINK-38585:
---------------------------------
Summary: PyFlink's thread mode may not work when using shipped
venv.zip archive as virtual env
Key: FLINK-38585
URL: https://issues.apache.org/jira/browse/FLINK-38585
Project: Flink
Issue Type: Bug
Components: API / Python
Reporter: Biao Geng
To run PyFlink jobs in a virtual env on a k8s cluster, we can use the option
`python.archives` to specify a venv.zip built in advance (see
[here|https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/dev/python/faq/#preparing-python-virtual-environment]
for more details). This works in process mode, but when using thread mode,
which is based on pemja, we see errors like the following:
{code:java}
2025-10-29 11:21:45,229 [Source: Collection Source -> Map, Map -> Sink: Print to Std. Out (1/1)#1] INFO org.apache.flink.python.util.CompressionUtils [] - extractFile duration: 8622 ms
2025-10-29 11:21:45,230 [Source: Collection Source -> Map, Map -> Sink: Print to Std. Out (1/1)#1] INFO org.apache.flink.python.env.AbstractPythonEnvironmentManager [] - Python interpreter path: venv3.zip/venv/bin/python
Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
Python path configuration:
PYTHONHOME = (not set)
PYTHONPATH = (not set)
program name = 'python3'
isolated = 0
environment = 1
user site = 1
import site = 1
sys._base_executable = '/usr/bin/python3'
sys.base_prefix = '/root/venv'
sys.base_exec_prefix = '/root/venv'
sys.platlibdir = 'lib'
sys.executable = '/usr/bin/python3'
sys.prefix = '/root/venv'
sys.exec_prefix = '/root/venv'
sys.path = [
'/root/venv/lib/python310.zip',
'/root/venv/lib/python3.10',
'/root/venv/lib/lib-dynload',
]
Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding
Python runtime state: core initialized
ModuleNotFoundError: No module named 'encodings'

Current thread 0x00007f52466fe700 (most recent call first):
<no Python frame>
{code}
The error shows that PYTHONHOME is not set correctly, which causes the
initialization of pemja's main interpreter to fail. As a result, the
default Python interpreter in the env is used, which could be quite old
and may be missing required dependencies.
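One way to confirm this kind of failure is to check whether the directory the interpreter treats as its home actually contains the standard library. The sketch below is only a diagnostic aid, not part of Flink or pemja: `find_stdlib_dir` is a hypothetical helper, and it assumes the POSIX layout `<home>/lib/pythonX.Y/encodings` that matches the `sys.path` entries in the log above.

```python
import os
import sys


def find_stdlib_dir(python_home, version=None):
    """Return the stdlib directory under python_home, or None.

    Hypothetical helper: it only checks for the 'encodings' package,
    which is the module the failing interpreter could not import.
    Assumes a POSIX layout: <python_home>/lib/pythonX.Y/encodings.
    """
    if version is None:
        # Default to the version of the interpreter running this check.
        version = "{}.{}".format(*sys.version_info[:2])
    candidate = os.path.join(python_home, "lib", "python" + version)
    if os.path.isdir(os.path.join(candidate, "encodings")):
        return candidate
    return None


if __name__ == "__main__":
    # '/root/venv' and '3.10' are the values reported in the log above.
    print(find_stdlib_dir("/root/venv", "3.10"))
```

If this returns `None` for the prefix reported in the log, the interpreter was initialized with a home that cannot provide `encodings`, which is consistent with the fatal error shown.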
After some debugging, I found that the cause is that pemja sets the Python
home after calling Py_Initialize (see
[code|https://github.com/alibaba/pemja/blob/4a3948d34d4fde1059a501839105d23f784869ab/src/main/c/pemja/core/pylib.c#L343]
for more details), so the interpreter is initialized with the wrong PYTHONHOME.
We should fix this so that thread mode can be used with a venv.
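For reference, the setup that triggers the problem can be sketched as a small set of Flink options. This is a minimal sketch, not the exact configuration of the failing job: the helper names are hypothetical, the archive name `venv.zip` and the inner path `venv/bin/python` are assumptions modeled on the interpreter path in the log, and only the option keys `python.archives`, `python.executable` and `python.execution-mode` are taken from the Flink configuration.

```python
def thread_mode_venv_options(archive="venv.zip", interpreter="venv/bin/python"):
    """Build Flink options for thread mode with a shipped venv archive.

    Hypothetical helper. With no target directory given, the archive is
    assumed to be extracted into a directory named after the archive
    file, so the interpreter path is '<archive>/<interpreter>'.
    """
    return {
        "python.archives": archive,
        "python.executable": f"{archive}/{interpreter}",
        # Switching this value to "process" is the mode reported as working.
        "python.execution-mode": "thread",
    }


def to_conf_lines(options):
    """Render the options as 'key: value' lines, sorted by key."""
    return [f"{key}: {value}" for key, value in sorted(options.items())]


if __name__ == "__main__":
    for line in to_conf_lines(thread_mode_venv_options()):
        print(line)
```

Keeping everything identical except `python.execution-mode` makes it easier to compare thread mode against process mode when reproducing the issue.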
--
This message was sent by Atlassian Jira
(v8.20.10#820010)