Hi Nimrod,
One approach is to identify the additional jars you need and include them
in the Dockerfile for the Spark image (under /opt/spark/jars);
that approach worked for me.
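
For example, a minimal Dockerfile sketch - the Kafka connector coordinates
below are just placeholders for whatever jars you actually need, and I'm
assuming the base image's non-root user is "spark":

FROM spark:3.5.1-scala2.12-java17-python3-r-ubuntu

# Need root to write into /opt/spark/jars
USER root

# Pull the extra jars straight from Maven Central onto Spark's classpath.
# Unlike --packages, this does NOT resolve transitive dependencies,
# so each required jar has to be listed explicitly.
ADD https://repo1.maven.org/maven2/org/apache/spark/spark-sql-kafka-0-10_2.12/3.5.1/spark-sql-kafka-0-10_2.12-3.5.1.jar /opt/spark/jars/
ADD https://repo1.maven.org/maven2/org/apache/kafka/kafka-clients/3.4.1/kafka-clients-3.4.1.jar /opt/spark/jars/

# Files ADDed from a URL default to 600 permissions; make them readable at runtime
RUN chmod 644 /opt/spark/jars/*.jar

USER spark

That keeps dependency resolution out of the job startup path entirely.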
Alternatively, you can declare the needed packages in the SparkApplication
YAML.
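
If you go that route, it looks roughly like the snippet below - a sketch
assuming the Kubeflow spark-on-k8s-operator v1beta2 CRD (field support can
differ between operator versions), with placeholder names and coordinates:

apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: my-spark-app
spec:
  type: Scala
  mode: cluster
  image: my-registry/spark-custom:3.5.1
  mainClass: com.example.Main
  mainApplicationFile: local:///opt/spark/work-dir/my-app.jar
  sparkVersion: "3.5.1"
  # Resolved by Ivy at submit time, so this carries the same per-run
  # cost as passing --packages on the command line.
  deps:
    packages:
      - org.apache.spark:spark-sql-kafka-0-10_2.12:3.5.1
  driver:
    cores: 1
    memory: 1g
    serviceAccount: spark
  executor:
    instances: 2
    cores: 1
    memory: 2g

Note that the packages still get resolved on every submission, so if startup
time matters, baking the jars into the image (the first option) is the way
to go.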

HTH,
Karan Alang

On Tue, Oct 15, 2024 at 11:45 AM Nimrod Ofek <ofek.nim...@gmail.com> wrote:

> Hi all,
>
> I am creating a base Spark image that we are using internally.
> We need to add some packages to the base image:
> spark:3.5.1-scala2.12-java17-python3-r-ubuntu
>
> Of course, I do not want to start Spark with --packages "..." - that is
> not efficient at all - I would like to add the needed jars to the image.
>
> Ideally, I would add to my image something that installs the needed
> packages - something like:
>
> RUN $SPARK_HOME/bin/add-packages "..."
>
> But AFAIK there is no such option.
>
> Other than running Spark to add those packages and then creating the image
> - or running Spark always with --packages "..."  - what can I do?
> Is there a way to run just the code that the --packages option runs -
> without running Spark, so I can add the needed dependencies to my image?
>
> I am sure I am not the only one, nor the first, to encounter this...
>
> Thanks!
> Nimrod
>