I run my Spark jobs in GCP with Google Dataproc using GCS buckets.
I've not used AWS, but its EMR product offers similar functionality to
Dataproc. The title of your post implies your Spark cluster runs on EKS.
You might be better off using EMR; see the links below:
EMR: https://medium.com/big-data-on-amazon-elastic-mapreduce/run-a-spark-job-within-amazon-emr-in-15-minutes-68b02af1ae16
EKS: https://medium.com/@vikas.navlani/running-spark-on-aws-eks-1cd4c31786c
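That said, whichever platform you end up on, a JAR in S3 can usually be passed to spark-submit directly with the s3a:// scheme, as long as the hadoop-aws module and credentials are available. I haven't tried this myself (I work with GCS rather than S3), so treat the following as a rough sketch; the bucket, paths, and class names are placeholders:

    // Submission side (placeholder names), assuming hadoop-aws is on the classpath:
    //   spark-submit --class com.example.Main \
    //     --packages org.apache.hadoop:hadoop-aws:3.3.4 \
    //     s3a://my-bucket/jars/my-app.jar
    //
    // Application side: supplying S3A credentials through the Hadoop
    // configuration (an IAM role / instance profile is preferable in
    // production to passing keys explicitly):
    import org.apache.spark.sql.SparkSession

    object S3JarSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("s3a-access-sketch")
          // Standard Hadoop S3A settings; values pulled from the environment here.
          .config("spark.hadoop.fs.s3a.access.key",
                  sys.env.getOrElse("AWS_ACCESS_KEY_ID", ""))
          .config("spark.hadoop.fs.s3a.secret.key",
                  sys.env.getOrElse("AWS_SECRET_ACCESS_KEY", ""))
          .getOrCreate()

        // Read something from the same bucket just to confirm S3A access works.
        spark.read.text("s3a://my-bucket/some-input.txt").show(5)

        spark.stop()
      }
    }

On the performance question, the JAR is only fetched from remote storage once at application startup, so in my experience (with GCS, at least) that overhead is negligible compared with the job itself.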
Richard
On 19/02/2024 13:36, Jagannath Majhi wrote:
Dear Spark Community,
I hope this email finds you well. I am reaching out to seek assistance
and guidance regarding a task I'm currently working on involving
Apache Spark.
I have developed a JAR file that contains some Spark applications and
functionality, and I need to run this JAR file within a Spark cluster.
However, the JAR file is located in an AWS S3 bucket. I'm facing some
challenges in configuring Spark to access and execute this JAR file
directly from the S3 bucket.
I would greatly appreciate any advice, best practices, or pointers on
how to achieve this integration effectively. Specifically, I'm looking
for insights on:
1. Configuring Spark to access and retrieve the JAR file from an AWS
S3 bucket.
2. Setting up the necessary permissions and authentication mechanisms
to ensure seamless access to the S3 bucket.
3. Any potential performance considerations or optimizations when
running Spark applications with dependencies stored in remote
storage like AWS S3.
If anyone in the community has prior experience or knowledge in this
area, I would be extremely grateful for your guidance. Additionally,
if there are any relevant resources, documentation, or tutorials that
you could recommend, it would be incredibly helpful.
Thank you very much for considering my request. I look forward to
hearing from you and benefiting from the collective expertise of the
Spark community.
Best regards, Jagannath Majhi