Srinivas Rishindra Pothireddi created SPARK-51095:
-----------------------------------------------------
Summary: Spark should include "caller context" for hdfs audit logs on driver
Key: SPARK-51095
URL: https://issues.apache.org/jira/browse/SPARK-51095
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Srinivas Rishindra Pothireddi
HDFS audit logs support a "caller context" field. Spark already uses it to
record the YARN application ID, job ID, task ID, etc., _but only on executors_.
The caller context is left empty on the Spark driver.
(See https://issues.apache.org/jira/browse/SPARK-15857 and the related HDFS JIRAs.)
We should update Spark to set the caller context on the driver as well, so
that it includes the YARN application ID.
This is also relevant for Iceberg tables: some table maintenance operations
may be performed by the driver, and it would be useful to be able to identify
the YARN application responsible for each of those files.
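As a rough illustration of the driver-side context proposed above, here is a
minimal sketch of how such a string could be assembled. The class name
DriverCallerContext, the "SPARK_DRIVER_" prefix, and the maxLen parameter are
all hypothetical and chosen only for this example; they are not Spark's
existing format. In a real change the resulting string would presumably be
handed to Hadoop's org.apache.hadoop.ipc.CallerContext, which is left out here
so the snippet runs without Hadoop on the classpath.

```java
/**
 * Hypothetical helper (not part of Spark): builds a caller-context string
 * that a Spark driver could attach to its HDFS requests.
 */
public final class DriverCallerContext {
    private DriverCallerContext() {}

    /**
     * Returns "SPARK_DRIVER_<appId>", with "_<appAttemptId>" appended when an
     * attempt ID is given, truncated to at most maxLen characters.
     * The prefix and layout are assumptions made for this sketch.
     */
    public static String build(String appId, String appAttemptId, int maxLen) {
        if (appId == null || appId.isEmpty()) {
            throw new IllegalArgumentException("appId is required");
        }
        if (maxLen <= 0) {
            throw new IllegalArgumentException("maxLen must be positive");
        }
        StringBuilder sb = new StringBuilder("SPARK_DRIVER_").append(appId);
        if (appAttemptId != null && !appAttemptId.isEmpty()) {
            sb.append('_').append(appAttemptId);
        }
        String context = sb.toString();
        // Truncate so the context stays within the caller-supplied limit.
        return context.length() > maxLen ? context.substring(0, maxLen) : context;
    }

    public static void main(String[] args) {
        // Example with a made-up YARN application ID and attempt ID.
        System.out.println(build("application_1700000000000_0001", "1", 128));
    }
}
```

With something like this set once at driver startup, HDFS audit entries for
driver-side operations would carry the YARN application ID in the same way
executor-side entries already do.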