Pyspark Memory Woes

2014-03-11 Thread Aaron Olson
doing, specifically around moving data in and out of Python land? I realise it's hard to troubleshoot in the absence of code, but any test case we have would be contrived. We're collecting more metrics and trying to reason about what might be happening, but any guidance at this point would be most helpful.

Thanks!

--
Aaron Olson
Data Engineer, Shopify

Re: Pyspark Memory Woes

2014-03-11 Thread Aaron Olson
his
> comfortably.", how are you configuring this?
>
> -Sandy
>
> On Tue, Mar 11, 2014 at 10:11 AM, Aaron Olson wrote:
>
>> Dear Sparkians,
>>
>> We are working on a system to do relational modeling on top of Spark, all
>> done in pyspark. While

Re: Pyspark Memory Woes

2014-03-12 Thread Aaron Olson
ote:
> Are you aware that you get an executor (and the 1.5GB) per machine, not
> per core?
>
> On Tue, Mar 11, 2014 at 12:52 PM, Aaron Olson wrote:
>
>> Hi Sandy,
>>
>> We're configuring that with the JAVA_OPTS environment variable in
>> $SPAR