OK, so you have a jar file that you want to run with Spark on Kubernetes
(EKS) as the execution engine, as opposed to YARN on EMR as the execution
engine?


Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London
United Kingdom


   view my LinkedIn profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* The information provided is correct to the best of my
knowledge but of course cannot be guaranteed. It is essential to note
that, as with any advice, "one test result is worth one-thousand expert
opinions" (Wernher von Braun
<https://en.wikipedia.org/wiki/Wernher_von_Braun>).


On Mon, 19 Feb 2024 at 14:38, Jagannath Majhi <
jagannath.ma...@cloud.cbnits.com> wrote:

> I am not using any private Docker image. I am only running the jar file on
> EMR using the spark-submit command, and now I want to run this jar file on
> EKS, so can you please tell me how I can set this up?
>
> On Mon, Feb 19, 2024, 8:06 PM Jagannath Majhi <
> jagannath.ma...@cloud.cbnits.com> wrote:
>
>> Can we connect over Google Meet?
>>
>> On Mon, Feb 19, 2024, 8:03 PM Mich Talebzadeh <mich.talebza...@gmail.com>
>> wrote:
>>
>>> Where is your Docker image? In the ECR container registry?
>>> If you are going to use EKS, then it needs to be accessible to all nodes
>>> of the cluster.
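>>>
>>> As a minimal sketch (assuming an ECR repository named spark already
>>> exists, and substituting your own <ACCOUNT_ID> and <REGION>), building
>>> and pushing the image so that the EKS nodes can pull it would look
>>> something like this:
>>>
>>> # Authenticate Docker against your private ECR registry
>>> aws ecr get-login-password --region <REGION> | \
>>>     docker login --username AWS --password-stdin \
>>>     <ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com
>>> # Build and push the Spark image that contains your jar
>>> docker build -t <ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/spark:latest .
>>> docker push <ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/spark:latest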
>>>
>>> When you build your Docker image, put your jar under the $SPARK_HOME
>>> directory, then add the lines below to your Dockerfile.
>>> Here I am accessing the Google BigQuery DW from EKS:
>>>
>>> # Add a BigQuery connector jar.
>>> ENV SPARK_EXTRA_JARS_DIR=/opt/spark/jars/
>>> ENV SPARK_EXTRA_CLASSPATH='/opt/spark/jars/*'
>>> RUN mkdir -p "${SPARK_EXTRA_JARS_DIR}" \
>>>     && chown spark:spark "${SPARK_EXTRA_JARS_DIR}"
>>> COPY --chown=spark:spark \
>>>     spark-bigquery-with-dependencies_2.12-0.22.2.jar "${SPARK_EXTRA_JARS_DIR}"
>>>
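>>> Once the image is in ECR, a minimal spark-submit against the EKS
>>> cluster would look something like the below. The application class,
>>> app jar name, image URI and service account are placeholders here,
>>> so substitute your own values:
>>>
>>> spark-submit \
>>>     --master k8s://https://<EKS_API_SERVER_ENDPOINT> \
>>>     --deploy-mode cluster \
>>>     --name my-spark-app \
>>>     --class com.example.MyApp \
>>>     --conf spark.executor.instances=2 \
>>>     --conf spark.kubernetes.container.image=<ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/spark:latest \
>>>     --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
>>>     local:///opt/spark/jars/my-app.jar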
>>>
>>> HTH
>>>
>>> Mich Talebzadeh,
>>> Dad | Technologist | Solutions Architect | Engineer
>>> London
>>> United Kingdom
>>>
>>>
>>>    view my LinkedIn profile
>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>
>>>
>>>  https://en.everybodywiki.com/Mich_Talebzadeh
>>>
>>>
>>>
>>> *Disclaimer:* The information provided is correct to the best of my
>>> knowledge but of course cannot be guaranteed. It is essential to note
>>> that, as with any advice, "one test result is worth one-thousand expert
>>> opinions" (Wernher von Braun
>>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>).
>>>
>>>
>>> On Mon, 19 Feb 2024 at 13:42, Jagannath Majhi <
>>> jagannath.ma...@cloud.cbnits.com> wrote:
>>>
>>>> Dear Spark Community,
>>>>
>>>> I hope this email finds you well. I am reaching out to seek assistance
>>>> and guidance regarding a task I'm currently working on involving Apache
>>>> Spark.
>>>>
>>>> I have developed a JAR file that contains some Spark applications and
>>>> functionality, and I need to run this JAR file within a Spark cluster.
>>>> However, the JAR file is located in an AWS S3 bucket. I'm facing some
>>>> challenges in configuring Spark to access and execute this JAR file
>>>> directly from the S3 bucket.
>>>>
>>>> I would greatly appreciate any advice, best practices, or pointers on
>>>> how to achieve this integration effectively. Specifically, I'm looking for
>>>> insights on:
>>>>
>>>>    1. Configuring Spark to access and retrieve the JAR file from an
>>>>    AWS S3 bucket.
>>>>    2. Setting up the necessary permissions and authentication
>>>>    mechanisms to ensure seamless access to the S3 bucket.
>>>>    3. Any potential performance considerations or optimizations when
>>>>    running Spark applications with dependencies stored in remote
>>>>    storage like AWS S3.
>>>>
>>>> If anyone in the community has prior experience or knowledge in this
>>>> area, I would be extremely grateful for your guidance. Additionally, if
>>>> there are any relevant resources, documentation, or tutorials that you
>>>> could recommend, it would be incredibly helpful.
>>>>
>>>> Thank you very much for considering my request. I look forward to
>>>> hearing from you and benefiting from the collective expertise of the Spark
>>>> community.
>>>>
>>>> Best regards, Jagannath Majhi
>>>>
>>>
