The issue you are encountering is due to the order of operations when Spark
launches the JVM for the driver and executor pods. JVM options such as
-Dlog4j2.configurationFile are evaluated when the JVM starts, but the files
distributed with --files are copied into the working directory only after
the JVM is already running. Hence, the log4j configuration file does not
yet exist at the moment Log4j goes looking for it.
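
You can verify this by setting -Dlog4j2.debug=true (you currently pass
false): the Log4j StatusLogger will then trace where it searched for a
configuration file during JVM startup, which happens before any --files
copying has taken place.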

In summary, you need to ensure the file is in place before the Spark driver
or executor JVM starts.
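
One way to do that on Kubernetes is to put the file in a ConfigMap, mount
it into the pods via pod templates, and point Log4j at the absolute mount
path. A sketch, untested; the names spark-log4j, /opt/spark/log4j and
pod-template.yaml are placeholders:

# Create the ConfigMap in the job's namespace
kubectl create configmap spark-log4j --from-file=my-log4j.xml

# pod-template.yaml (Spark treats the first container in the template
# as the driver/executor container, so the container name is arbitrary)
apiVersion: v1
kind: Pod
spec:
  containers:
    - name: spark
      volumeMounts:
        - name: log4j-config
          mountPath: /opt/spark/log4j
  volumes:
    - name: log4j-config
      configMap:
        name: spark-log4j

# spark-submit, referencing the mounted file by absolute path
spark-submit ... \
  --conf spark.kubernetes.driver.podTemplateFile=pod-template.yaml \
  --conf spark.kubernetes.executor.podTemplateFile=pod-template.yaml \
  --conf spark.driver.extraJavaOptions="-Dlog4j2.configurationFile=file:/opt/spark/log4j/my-log4j.xml" \
  --conf spark.executor.extraJavaOptions="-Dlog4j2.configurationFile=file:/opt/spark/log4j/my-log4j.xml"

Kubernetes mounts volumes before the container's entrypoint runs, so the
file is guaranteed to be in place when the JVM starts, and you can change
the logging per job by pointing at a different ConfigMap.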

HTH

Mich Talebzadeh,

Technologist | Architect | Data Engineer | Generative AI | FinCrime
PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial College
London <https://en.wikipedia.org/wiki/Imperial_College_London>
London, United Kingdom


   view my LinkedIn profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* The information provided is correct to the best of my
knowledge but of course cannot be guaranteed. It is essential to note
that, as with any advice, "one test result is worth one-thousand
expert opinions" (Wernher von Braun
<https://en.wikipedia.org/wiki/Wernher_von_Braun>).


On Thu, 6 Jun 2024 at 17:04, Jennifer Wirth <jennywirt...@gmail.com> wrote:

> Hi,
>
> I am trying to change the log4j configuration for jobs submitted to a k8s
> cluster (using spark-submit).
>
> The my-log4j.xml is uploaded using --files ./my-log4j.xml, and the file
> ends up in the working directory of the driver/exec pods.
>
> I added -D flags using the extra Java options (and tried many different
> URIs: absolute and relative, with and without the file: scheme).
>
> --conf spark.driver.extraJavaOptions="-Dlog4j2.debug=false 
> -Dlog4j2.configurationFile=file:./my-log4j.xml" \
>     --conf spark.executor.extraJavaOptions="-Dlog4j2.debug=false 
> -Dlog4j2.configurationFile=file:./my-log4j.xml" \
>
> When debugging, I noticed that Log4j is not able to load my configuration
> file. I see the following additional log entries:
>
> ERROR StatusLogger Reconfiguration failed: No configuration found for '4a87761d' at 'null' in 'null'
> ERROR StatusLogger Reconfiguration failed: No configuration found for 'Default' at 'null' in 'null'
> 24/06/06 09:20:44 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> Files file:///tmp/mk2/spark-upload-36b1f43d-1878-4b06-be5e-e25b703f28d5/orbit-movements.csv from /tmp/mk2/spark-upload-36b1f43d-1878-4b06-be5e-e25b703f28d5/orbit-movements.csv to /opt/spark/work-dir/orbit-movements.csv
> Files file:///tmp/mk2/spark-upload-cc211704-f481-4ebe-b6f0-5dbe66a7c639/my-log4j.xml from /tmp/mk2/spark-upload-cc211704-f481-4ebe-b6f0-5dbe66a7c639/my-log4j.xml to /opt/spark/work-dir/my-log4j.xml
> Files file:///tmp/mk2/spark-upload-7970b482-7669-49aa-9f88-65191a83a18a/out.jar from /tmp/mk2/spark-upload-7970b482-7669-49aa-9f88-65191a83a18a/out.jar to /opt/spark/work-dir/out.jar
>
> The lines starting with Files in the logs of the driver process make me
> wonder whether the copying of files from my shared mount to the working
> directory happens inside that process, rather than before the Java process
> launches. Is that assumption correct? It would explain why my log4j config
> file is not found at JVM launch.
>
> If so, what is the recommended way to change the logging config *per job*
> when running Spark in k8s? (I am not using a custom container image, so I
> can’t place it in there.)
>
> tx.,
>
