Re: Spark shell crumbles after memory is full

2015-06-29 Thread Hans van den Bogert
Would there be a way to force the 'old' data out? Because at this point I'll have to restart the shell every couple of queries to get meaningful timings which are comparable to spark-submit.

On Jun 29, 2015 6:20 PM, "Mark Hamstra" wrote:
> No. He is collecting the results of the SQL query, not

Re: Spark shell crumbles after memory is full

2015-06-29 Thread Mark Hamstra
No. He is collecting the results of the SQL query, not the whole dataset. The REPL does retain references to prior results, so it's not really the best tool to be using when you want no-longer-needed results to be automatically garbage collected.

On Mon, Jun 29, 2015 at 9:13 AM, ayan guha wrote:
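Mark's point is the crux: spark-shell binds every evaluated result to a fresh val (`res0`, `res1`, ...), and those retained references keep the collected arrays reachable, so the JVM cannot garbage-collect them. A plain-Python analogue of the same retention effect, with no Spark needed (`BigResult` and `probe` are illustrative names, not Spark API):

```python
import gc
import weakref

class BigResult:
    """Stand-in for a large collected query result held by the REPL."""
    pass

res0 = BigResult()           # the shell binds every result to a val like res0
probe = weakref.ref(res0)    # lets us observe whether the object was freed

gc.collect()
print(probe() is None)       # False: the retained reference keeps it alive

res0 = None                  # drop the reference, as a fresh shell would
gc.collect()
print(probe() is None)       # True: the collector can now reclaim it
```

This is why restarting the shell "fixes" the timings: a fresh session holds no `resN` references, so nothing pins the old results in heap memory.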

Re: Spark shell crumbles after memory is full

2015-06-29 Thread ayan guha
When you call collect, you are bringing the whole dataset back to driver memory.

On 30 Jun 2015 01:43, "hbogert" wrote:
> I'm running a query from the BigDataBenchmark, query 1B to be precise.
>
> When running this with Spark (1.3.1) + Mesos (0.21) in coarse-grained mode
> with 5 Mesos slaves, through a
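In other words, `collect()` materialises every row on the driver, while `take(n)` or a server-side aggregation keeps the driver's footprint bounded. A minimal Python analogue of the difference (`rows()` is a stand-in for a distributed dataset, not Spark API):

```python
import itertools

def rows():
    # Stand-in for a distributed dataset: rows are produced lazily,
    # the way partitions stream through executors.
    for i in range(1_000_000):
        yield i

# collect()-style: materialise every row at the "driver"
collected = list(rows())

# take(10)-style: only a bounded number of rows reach the "driver"
sample = list(itertools.islice(rows(), 10))

print(len(collected))  # 1000000
print(len(sample))     # 10
```

For benchmarking query time rather than transfer time, counting or taking a small sample avoids turning the driver heap into the bottleneck.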