Hi Jonathan,

This is a problem that has come up for us as well, because we'd like
dynamic allocation to be turned on by default in some setups, but without
breaking existing users who already set these properties.  I'm hoping to
figure out a way to reconcile the two by Spark 1.5.

-Sandy

On Wed, Jul 15, 2015 at 3:18 PM, Kelly, Jonathan <jonat...@amazon.com>
wrote:

>   Would there be any problem in having spark.executor.instances (or
> --num-executors) be completely ignored (i.e., even for non-zero values)
> when spark.dynamicAllocation.enabled is true, rather than throwing an
> exception?
>
>  I can see how the exception would be helpful if, say, you tried to pass
> both "-c spark.executor.instances" (or --num-executors) *and* "-c
> spark.dynamicAllocation.enabled=true" to spark-submit on the command line
> (as opposed to having one of them in spark-defaults.conf and one of them in
> the spark-submit args), but currently there doesn't seem to be any way to
> distinguish between arguments that were actually passed to spark-submit and
> settings that simply came from spark-defaults.conf.
>
>  If there were a way to distinguish them, I think the ideal situation
> would be for the validation exception to be thrown only if
> spark.executor.instances and spark.dynamicAllocation.enabled=true were both
> passed via spark-submit args or were both present in spark-defaults.conf,
> but passing spark.dynamicAllocation.enabled=true to spark-submit would take
> precedence over spark.executor.instances configured in spark-defaults.conf,
> and vice versa.
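>
>  For example, something along these lines (purely illustrative, not a
> patch I've written; the setOnCli helper is exactly the piece that
> doesn't exist today):
>
>     import org.apache.spark.SparkConf
>
>     // Hypothetical reconciliation of the two settings, assuming we
>     // could tell which of them actually came from the spark-submit
>     // command line (the setOnCli helper, which Spark doesn't provide).
>     def reconcile(conf: SparkConf, setOnCli: String => Boolean): Unit = {
>       val dyn = conf.getBoolean("spark.dynamicAllocation.enabled", false)
>       if (dyn && conf.contains("spark.executor.instances")) {
>         val dynFromCli = setOnCli("spark.dynamicAllocation.enabled")
>         val numFromCli = setOnCli("spark.executor.instances")
>         if (dynFromCli == numFromCli) {
>           // Both from spark-submit args or both from
>           // spark-defaults.conf: a genuine conflict, so keep throwing.
>           throw new IllegalArgumentException(
>             "Explicitly setting the number of executors is not " +
>               "compatible with spark.dynamicAllocation.enabled!")
>         } else if (dynFromCli) {
>           conf.remove("spark.executor.instances")    // CLI flag wins
>         } else {
>           // explicit executor count on the CLI wins
>           conf.set("spark.dynamicAllocation.enabled", "false")
>         }
>       }
>     }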
>
>
>  Jonathan Kelly
>
> Elastic MapReduce - SDE
>
> Blackfoot (SEA33) 06.850.F0
>
>   From: Jonathan Kelly <jonat...@amazon.com>
> Date: Tuesday, July 14, 2015 at 4:23 PM
> To: "user@spark.apache.org" <user@spark.apache.org>
> Subject: Unable to use dynamicAllocation if spark.executor.instances is
> set in spark-defaults.conf
>
>   I've set up my cluster with a pre-calculated value for
> spark.executor.instances in spark-defaults.conf such that I can run a job
> and have it maximize the utilization of the cluster resources by default.
> However, if I want to run a job with dynamicAllocation (by passing -c
> spark.dynamicAllocation.enabled=true to spark-submit), I get this exception:
>
>  Exception in thread "main" java.lang.IllegalArgumentException:
> Explicitly setting the number of executors is not compatible with
> spark.dynamicAllocation.enabled!
> at
> org.apache.spark.deploy.yarn.ClientArguments.parseArgs(ClientArguments.scala:192)
> at
> org.apache.spark.deploy.yarn.ClientArguments.<init>(ClientArguments.scala:59)
> at
> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:54)
>  …
>
>  The exception makes sense, of course, but ideally I would like it to
> ignore what I've put in spark-defaults.conf for spark.executor.instances if
> I've enabled dynamicAllocation. The most annoying thing about this is that
> if I have spark.executor.instances present in spark-defaults.conf, I cannot
> figure out any way to spark-submit a job with
> spark.dynamicAllocation.enabled=true without getting this error. That is,
> even if I pass "-c spark.executor.instances=0 -c
> spark.dynamicAllocation.enabled=true", I still get this error, because the
> validation in ClientArguments.parseArgs() only checks for the presence of
> spark.executor.instances rather than whether its value is > 0.
>
>  Should the check be changed to allow spark.executor.instances to be set
> to 0 if spark.dynamicAllocation.enabled is true? That would be an OK
> compromise, but I'd really prefer to be able to enable dynamicAllocation
> simply by setting spark.dynamicAllocation.enabled=true rather than by also
> having to set spark.executor.instances to 0.
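>
>  A minimal sketch of the relaxed check I have in mind (not the actual
> ClientArguments code, just the shape of the change):
>
>     import org.apache.spark.SparkConf
>
>     // Only reject a *positive* executor count combined with dynamic
>     // allocation; spark.executor.instances=0 (or unset) is allowed.
>     def validateExecutorSettings(conf: SparkConf): Unit = {
>       val numExecutors = conf.getInt("spark.executor.instances", 0)
>       val dyn = conf.getBoolean("spark.dynamicAllocation.enabled", false)
>       if (dyn && numExecutors > 0) {
>         throw new IllegalArgumentException(
>           "Explicitly setting the number of executors is not " +
>             "compatible with spark.dynamicAllocation.enabled!")
>       }
>     }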
>
>
>  Thanks,
>
> Jonathan
>
