Hi all,

I am creating a base Spark image that we will use internally.
We need to add some packages on top of the base image
spark:3.5.1-scala2.12-java17-python3-r-ubuntu.

Of course I do not want to start Spark with --packages "..." - that is not
efficient at all - I would rather add the needed jars to the image itself.

Ideally, I would add to my image a step that installs the needed
packages - something like:

RUN $SPARK_HOME/bin/add-packages "..."

But AFAIK there is no such option.
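
The closest I have gotten is pulling individual jars straight from Maven
Central in the Dockerfile - just a rough sketch, with an example coordinate,
and assuming curl is available in the image:

# Example only: fetch a single jar directly from Maven Central into
# Spark's jars directory.
RUN curl -fSL \
      https://repo1.maven.org/maven2/org/apache/spark/spark-sql-kafka-0-10_2.12/3.5.1/spark-sql-kafka-0-10_2.12-3.5.1.jar \
      -o "$SPARK_HOME/jars/spark-sql-kafka-0-10_2.12-3.5.1.jar"

But that only fetches the one jar, not its transitive dependencies, which
is exactly the part --packages handles for me.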

Other than running Spark once to pull in those packages and then creating
the image from the result - or always running Spark with --packages "..." -
what can I do?
Is there a way to run just the resolution code behind --packages - without
running Spark - so I can add the needed dependencies to my image?
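
For reference, the build-time workaround I mentioned looks roughly like
this - only a sketch, assuming the build step can write to $SPARK_HOME/jars
and that the jars resolved by --packages end up under the default Ivy
location ~/.ivy2 (the coordinate is again a placeholder):

# Sketch: run Spark once during the build only to trigger the same Ivy
# resolution --packages performs, then keep the resolved jars.
# "org.example:some-connector:1.2.3" is a placeholder coordinate.
RUN echo | $SPARK_HOME/bin/spark-shell --packages "org.example:some-connector:1.2.3" \
 && cp "$HOME"/.ivy2/jars/*.jar "$SPARK_HOME/jars/" \
 && rm -rf "$HOME"/.ivy2

That appears to work (piping empty input makes spark-shell start and exit),
but it drags a full Spark startup into the build just to download jars -
hence my question.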

I am sure I am not the only one, nor the first, to run into this...

Thanks!
Nimrod
