Re: Losing executors due to memory problems

2016-08-12 Thread Bedrytski Aliaksandr
Hi Vinay, just out of curiosity, why are you converting your DataFrames into RDDs before the join? Joins work quite well with DataFrames. As for your problem, it looks like you gave your executors more memory than the machines physically have. As an example of an executor configuration: > Cluster of 6 n
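The configuration example in the message above is truncated, so here is a separate, hypothetical sizing sketch in the same spirit. All numbers (6 nodes, 64 GB / 16 cores each, the headroom figures, and the job/jar names) are illustrative assumptions, not the poster's actual settings; the overhead property name is the YARN one used by Spark versions of that era.

```shell
# Hypothetical sizing sketch (assumed numbers, not the truncated
# configuration from the message above): 6 worker nodes, each with
# 64 GB RAM and 16 cores.
#
# Rough arithmetic:
#   3 executors per node  -> 18 total, minus 1 slot left for the AM = 17
#   5 cores per executor  -> 15 of 16 cores used, 1 spare for the OS
#   (64g - ~7g headroom) / 3 executors ~= 19g heap per executor,
#   plus off-heap overhead requested separately below.
spark-submit \
  --master yarn \
  --num-executors 17 \
  --executor-cores 5 \
  --executor-memory 19g \
  --conf spark.yarn.executor.memoryOverhead=2048 \
  --class com.example.MyJob myjob.jar
```

The point is that heap + overhead, summed over the executors on one node, must stay below the node's physical RAM; requesting more than that is a common way to lose executors to the OS or YARN killing them.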

Re: Losing executors due to memory problems

2016-08-12 Thread Koert Kuipers
You could have a very large key, perhaps a token value? I love the RDD API, but have found that for joins the DataFrame/Dataset API performs better. Can you maybe do the joins with that? On Thu, Aug 11, 2016 at 7:41 PM, Muttineni, Vinay wrote: > Hello, > > I have a spark job that basically reads data from
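The "very large key" suspicion above (one hot key, such as an empty or default token, concentrating most of the rows on a single reducer) can be checked by counting key frequencies on a sample of the data. A minimal plain-Python sketch of that check follows; the record layout, the helper name `find_hot_keys`, and the 10x-median threshold are all assumptions for illustration. On a real RDD you would run the same idea via something like `rdd.keys().sample(...).countByValue()` before the join.

```python
from collections import Counter
from statistics import median

def find_hot_keys(records, threshold=10):
    """Return keys whose frequency exceeds threshold * median frequency.

    `records` is an iterable of (key, value) pairs, i.e. a local sample
    of the pair data that would otherwise be joined.
    """
    counts = Counter(key for key, _ in records)
    med = median(counts.values())
    return {k: c for k, c in counts.items() if c > threshold * med}

# Example: one key (an empty token) dominates the sample, exactly the
# kind of skew that blows up a single join partition.
sample = [("", i) for i in range(1000)] + [(str(i), i) for i in range(50)]
print(find_hot_keys(sample))  # the empty-string key stands out
```

If such a key turns up, common remedies are filtering it out, handling it separately, or salting it across partitions before the join.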