That makes a lot of sense, thanks for the concise answer!

On Wed, Jul 22, 2015 at 4:10 PM, Andrew Or <and...@databricks.com> wrote:
> Hi Michael,
>
> In general, driver-related properties should not be set through the
> SparkConf. This is because by the time the SparkConf is created, we have
> already started the driver JVM, so it's too late to change the memory,
> class paths and other properties.
>
> In cluster mode, executor-related properties should also not be set
> through the SparkConf. This is because the driver is run on the cluster
> just like the executors, and the executors are launched independently by
> whatever the cluster manager (e.g. YARN) is configured to do.
>
> The recommended way of setting these properties is either through the
> conf/spark-defaults.conf properties file, or through the spark-submit
> command line, e.g.:
>
> bin/spark-shell --master yarn --executor-memory 2g --driver-memory 5g
>
> Let me know if that answers your question,
> -Andrew
>
>
> 2015-07-22 12:38 GMT-07:00 Michael Misiewicz <mmisiew...@gmail.com>:
>
>> Hi group,
>>
>> I seem to have encountered a weird problem with 'spark-submit' and
>> manually setting SparkConf values in my applications.
>>
>> It seems like setting the configuration values spark.executor.memory
>> and spark.driver.memory doesn't have any effect when they are set from
>> within my application (i.e. prior to creating a SparkContext).
>>
>> In yarn-cluster mode, only the values specified on the command line via
>> spark-submit for driver and executor memory are respected; otherwise it
>> appears Spark falls back to defaults. For example,
>>
>> Correct behavior noted in the driver's logs on YARN when --executor-memory
>> is specified:
>>
>> 15/07/22 19:25:59 INFO yarn.YarnAllocator: Will request 200 executor
>> containers, each with 1 cores and 13824 MB memory including 1536 MB overhead
>> 15/07/22 19:25:59 INFO yarn.YarnAllocator: Container request (host: Any,
>> capability: <memory:13824, vCores:1>)
>>
>> But not when spark.executor.memory is specified prior to SparkContext
>> initialization:
>>
>> 15/07/22 19:22:22 INFO yarn.YarnAllocator: Will request 200 executor
>> containers, each with 1 cores and 2560 MB memory including 1536 MB overhead
>> 15/07/22 19:22:22 INFO yarn.YarnAllocator: Container request (host: Any,
>> capability: <memory:2560, vCores:1>)
>>
>> In both cases, executor memory should be 10g. Interestingly, I also set
>> spark.yarn.executor.memoryOverhead, which appears to be respected whether
>> I'm in yarn-cluster or yarn-client mode.
>>
>> Has anyone seen this before? Any idea what might be causing this behavior?
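For anyone hitting this later, here is a minimal Scala sketch of the two
patterns discussed above: the in-application SparkConf settings that are
ignored in yarn-cluster mode, versus passing the same values to
spark-submit as Andrew recommends. The class name MyApp, the jar name, and
the exact values are illustrative only; the 10g figures and the
memoryOverhead setting come from the thread above.

// What the application was doing. Per the discussion above, in
// yarn-cluster mode these memory settings are read too late: the driver
// JVM and executor containers have already been sized by the launcher,
// so the values set here are effectively ignored.
import org.apache.spark.{SparkConf, SparkContext}

object MyApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("my-app")
      .set("spark.executor.memory", "10g")  // ignored in yarn-cluster mode
      .set("spark.driver.memory", "10g")    // ignored: driver JVM already started
    val sc = new SparkContext(conf)
    // ... job logic ...
    sc.stop()
  }
}

// What works: give the memory settings to spark-submit (or put them in
// conf/spark-defaults.conf) so the launcher can size the JVMs up front:
//
//   bin/spark-submit --master yarn --deploy-mode cluster \
//     --driver-memory 10g --executor-memory 10g \
//     --conf spark.yarn.executor.memoryOverhead=1536 \
//     --class MyApp my-app.jar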