from:"javaguy Java"

Re: OutOfMemoryError

2021-07-06 Thread javaguy Java

, Jul 1, 2021 at 1:43 PM Sean Owen wrote: > You need to set driver memory before the driver starts, on the CLI or > however you run your app, not in the app itself. By the time the driver > starts to run your app, its heap is already set. > > On Thu, Jul 1, 2021 at 12:10 AM

OutOfMemoryError

2021-06-30 Thread javaguy Java

Hi, I'm getting Java OOM errors even though I'm setting my driver memory to 24g and executing against local[*]. I was wondering if anyone can give me any insight. The server this job is running on has more than enough memory, as does the Spark driver. The final result does write 3 CSV files t

Spark as an application server cache

2021-02-10 Thread javaguy Java

Hi, I was just curious if anyone has ever used Spark as an application server cache? My use case is:
* I have large datasets which need to be updated / inserted (upsert) in the database
* I have actually found that it is much easier to run a Spark submit job that pulls from the database, and co

Re: A simple example that demonstrates that a Spark distributed cluster is faster than Spark Local Standalone

2020-09-25 Thread javaguy Java

; You can try tools like https://codait.github.io/spark-bench/ to > generate large workloads. > > On Fri, Sep 25, 2020 at 1:03 AM javaguy Java wrote: > > > > Hi Sean, > > > > Thanks for your reply. > > > > I understand distribution and parallelism very well and

Re: A simple example that demonstrates that a Spark distributed cluster is faster than Spark Local Standalone

2020-09-24 Thread javaguy Java

cluster, the input, etc. You may not see a speedup in this problem > until you hit more scale or modify the job to distribute a little > better, etc. > > On Thu, Sep 24, 2020 at 1:43 PM javaguy Java wrote: > > > > Hi, > > > > I made a post on stackoverflow tha

A simple example that demonstrates that a Spark distributed cluster is faster than Spark Local Standalone

2020-09-24 Thread javaguy Java

Hi, I made a post on Stack Overflow that I can't seem to make any headway on: https://stackoverflow.com/questions/63834379/spark-performance-local-faster-than-cluster Before someone starts making suggestions on changing the code, note that the code and example in the above post are from a Udemy cour

6 matches
