Hi Sandy,
We are, yes. I strongly suspect we're not partitioning our data properly,
but maybe 1.5G is simply too small for our workload. I'll bump the executor
memory and see if we get better results.
It seems we should be setting it to (SPARK_WORKER_MEMORY + pyspark memory)
/ # of concurrent apps.
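For instance, with made-up numbers rather than our real config: SPARK_WORKER_MEMORY=32g, roughly 4g of pyspark worker processes, and 2 concurrent apps would give (32g + 4g) / 2 = 18g, i.e. something like:

export SPARK_JAVA_OPTS="-Dspark.executor.memory=18g"  # other -D flags left as they are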
Are you aware that you get an executor (and the 1.5GB) per machine, not per
core?
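To put rough numbers on it (illustrative, not your exact figures): with one 1500m executor per node and 24 cores each running a task, those 24 concurrent tasks share the single 1.5G heap, i.e. on the order of 1500m / 24 ≈ 62m per task, not 1.5G per task.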
On Tue, Mar 11, 2014 at 12:52 PM, Aaron Olson wrote:
> Hi Sandy,
>
> We're configuring that with the JAVA_OPTS environment variable in
> $SPARK_HOME/spark-worker-env.sh like this:
>
> # JAVA OPTS
> export SPARK_JA
Hi Sandy,
We're configuring that with the JAVA_OPTS environment variable in
$SPARK_HOME/spark-worker-env.sh like this:
# JAVA OPTS
export SPARK_JAVA_OPTS="-Dspark.ui.port=0 -Dspark.default.parallelism=1024
-Dspark.cores.max=256 -Dspark.executor.memory=1500m
-Dspark.worker.timeout=500 -Dspark.akka
Hi Aaron,
When you say "Java heap space is 1.5G per worker, 24 or 32 cores across 46
nodes. It seems like we should have more than enough to do this
comfortably.", how are you configuring this?
-Sandy
On Tue, Mar 11, 2014 at 10:11 AM, Aaron Olson wrote:
> Dear Sparkians,
>
> We are working on