On Thu, Jul 1, 2021 at 1:43 PM Sean Owen wrote:
> You need to set driver memory before the driver starts, on the CLI or
> however you run your app, not in the app itself. By the time the driver
> starts to run your app, its heap is already set.
>
> On Thu, Jul 1, 2021 at 12:10 AM
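
To illustrate the point in the reply above (the driver heap has to be fixed at launch, not from inside the app), here is a minimal sketch that assembles a spark-submit command line with the --driver-memory option. The helper name build_spark_submit, the jar name my-app.jar, and the class com.example.MyJob are hypothetical placeholders, not taken from the thread; only the spark-submit options --master, --driver-memory, and --class are real.

```python
import shlex
from typing import List


def build_spark_submit(app_jar: str, main_class: str,
                       driver_memory: str = "24g",
                       master: str = "local[*]") -> List[str]:
    """Return a spark-submit argv that sets the driver heap at launch time.

    The memory value is passed on the command line, so it is known before
    the driver JVM starts, rather than being set from inside the app.
    """
    return [
        "spark-submit",
        "--master", master,
        "--driver-memory", driver_memory,
        "--class", main_class,   # hypothetical main class
        app_jar,                 # hypothetical application jar
    ]


if __name__ == "__main__":
    argv = build_spark_submit("my-app.jar", "com.example.MyJob")
    # Print a shell-safe version of the command for copy/paste.
    print(shlex.join(argv))
```

The printed command can be run from a shell, or the list can be handed to subprocess.run; either way the 24g value from the original question reaches the JVM before the application code executes.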
Hi,
I'm getting Java OOM errors even though I'm setting my driver memory to 24g
and I'm executing against local[*].
I was wondering if anyone can give me any insight. The server this
job is running on has more than enough memory, as does the Spark
driver.
The final result does write 3 csv files t
Hi,
I was just curious if anyone has ever used Spark as an application server
cache?
My use case is:
* I have large datasets which need to be updated / inserted (upsert) in
the database
* I have actually found that it is much easier to run a Spark submit job
that pulls from the database, and co
> You can try tools like https://codait.github.io/spark-bench/ to
> generate large workloads.
>
> On Fri, Sep 25, 2020 at 1:03 AM javaguy Java wrote:
> >
> > Hi Sean,
> >
> > Thanks for your reply.
> >
> > I understand distribution and parallelism very well and
cluster, the input, etc. You may not see a speedup in this problem
> until you hit more scale or modify the job to distribute a little
> better, etc.
>
> On Thu, Sep 24, 2020 at 1:43 PM javaguy Java wrote:
> >
> > Hi,
> >
> > I made a post on stackoverflow tha
Hi,
I made a post on Stack Overflow that I can't seem to make any headway on:
https://stackoverflow.com/questions/63834379/spark-performance-local-faster-than-cluster
Before someone starts making suggestions on changing the code, note that
the code and example in the above post are from a Udemy cour