Niha created FLINK-38035:
----------------------------

             Summary: Security Vulnerability in PyFlink Logging Mechanism 
(PythonEnvUtils.java)
                 Key: FLINK-38035
                 URL: https://issues.apache.org/jira/browse/FLINK-38035
             Project: Flink
          Issue Type: Bug
          Components: API / Python
    Affects Versions: 1.20.1, 1.19.1
            Reporter: Niha


The logging statement in {{PythonEnvUtils.java}} is a potential security 
vulnerability: it may expose environment variables, including 
Kubernetes-mounted secrets, during PyFlink job submission.



The class 
[{{org.apache.flink.client.python.PythonEnvUtils}}|https://github.com/apache/flink/blob/master/flink-python/src/main/java/org/apache/flink/client/python/PythonEnvUtils.java#L372-L377]
 logs all environment variables at job startup with the following line:

 
{{LOG.info("Starting Python process with environment variables: {}", 
environment);}}

This line is problematic because it indiscriminately logs *all environment 
variables*, which may contain *sensitive credentials*.


h4. *Context: Kubernetes Operator Users Are Especially at Risk*

When Flink is deployed using the *Flink Kubernetes Operator*, secrets are 
commonly passed into pods as *environment variables* (via Kubernetes {{env}} or 
{{envFrom}} fields, e.g. from {{secretRef}}).

This includes:
 * Database credentials

 * Cloud service keys (e.g., {{AWS_SECRET_ACCESS_KEY}})

 * Tokens and encryption keys

 * Custom user-defined secrets

Logging these secrets in plain text within the Flink JobManager or TaskManager 
logs violates Kubernetes security best practices, which explicitly discourage 
exposing sensitive environment variables in logs, and poses a serious risk in 
production environments.

h4. *Proposed Fix*
 * Redact known sensitive keys ({{*_SECRET_*}}, {{*_TOKEN}}, {{*_KEY}}, 
{{PASSWORD}}, etc.) before logging.

Example fix snippet:
{{Map<String, String> redactedEnv = redactSensitive(environment);
LOG.info("Starting Python process with environment variables: {}", 
redactedEnv);}}

 * Consider an opt-in mechanism (e.g., {{log.python.env=true}}) for full 
environment visibility in safe/test setups.
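
One possible shape for the {{redactSensitive}} helper named in the snippet is 
sketched below. This is only an illustration, not existing Flink code: the 
class name {{EnvRedactor}}, the mask string, and the exact marker list are 
assumptions. It also matches by substring, which is broader than the glob-style 
patterns listed above, so it may mask some harmless variables whose names 
happen to contain one of the markers.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

/** Hypothetical helper; the name and the marker list are assumptions. */
final class EnvRedactor {

    /** Substrings that mark a variable name as likely sensitive (assumed list). */
    private static final Pattern SENSITIVE =
            Pattern.compile("SECRET|TOKEN|KEY|PASSWORD|CREDENTIAL");

    /** Placeholder written to the log instead of the real value. */
    static final String MASK = "******";

    private EnvRedactor() {}

    /** Case-insensitive check of the variable name against the marker list. */
    static boolean isSensitive(String name) {
        return SENSITIVE.matcher(name.toUpperCase(Locale.ROOT)).find();
    }

    /**
     * Returns a sorted copy of the environment with sensitive values masked.
     * The input map is left untouched, so the Python process still receives
     * the real values.
     */
    static Map<String, String> redactSensitive(Map<String, String> environment) {
        Map<String, String> redacted = new TreeMap<>();
        for (Map.Entry<String, String> entry : environment.entrySet()) {
            String value = isSensitive(entry.getKey()) ? MASK : entry.getValue();
            redacted.put(entry.getKey(), value);
        }
        return redacted;
    }
}
```

With such a helper, the log call would receive 
{{EnvRedactor.redactSensitive(environment)}} instead of {{environment}}. 
Masking values while keeping the names leaves the log useful for debugging 
which variables were set.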

h4. *Steps to Reproduce*
 # Set Kubernetes secrets as environment variables in a FlinkDeployment (e.g., 
via {{envFrom.secretRef}}).

 # Launch a PyFlink job using the Flink Kubernetes Operator.

 # Examine the JobManager logs.

 # Observe the secrets printed in plain text by the {{LOG.info}} call in 
{{PythonEnvUtils.java}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
