I think I have run into a related issue: args passed via spark-submit to my cluster dispatcher get lost in translation when launching the driver from Mesos. I'm suggesting this patch:
https://github.com/jayv/spark/commit/b2025ddc1d565d1cc3036200fc3b3046578f4b02

- Jo Voordeckers

On Thu, Nov 12, 2015 at 6:05 AM, John Omernik <[email protected]> wrote:

> Hey all,
>
> I noticed today that if I take a tgz as my URI for Mesos, that I have to
> repackaged it with my conf settings from where I execute say pyspark for
> the executors to have the right configuration settings.
>
> That is...
>
> If I take a "stock" tgz from makedistribution.sh, unpack it, and then set
> the URI in spark-defaults to be the unmodified tgz as the URI. Change other
> settings in both spark-defaults.conf and spark-env.sh, then run
> ./bin/pyspark from that unpacked directory, I guess I would have thought
> that when the executor spun up, that some sort of magic was happening where
> the conf directory or the conf settings would propagate out to the
> executors (thus making configuration changes easier to manage)
>
> For things to work, I had to unpack the tgz, change conf settings, then
> repackage the tgz with all my conf settings for the tgz in the URI then run
> it. Then it seemed to work.
>
> I have a work around, but I guess, from a usability point of view, it
> would be nice to have tgz that is "binaries" and that when it's run, it
> takes the conf at run time. It would help with managing multiple
> configurations that are using the same binaries (different models/apps etc)
> Instead of having to repackage an tgz for each app, it would just
> propagate...am I looking at this wrong?
>
> John
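
For reference, the repackaging workaround described in the quoted message (unpack the tgz, change conf settings, repackage) could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper name `repackage_with_conf` and its parameters are hypothetical, and it assumes the tgz unpacks to a single top-level directory whose `conf/` subdirectory holds the settings. It is not part of Spark or of the patch linked above.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def repackage_with_conf(stock_tgz, conf_dir, out_tgz):
    """Overlay the files in conf_dir onto <dist>/conf and write a new tgz.

    Hypothetical helper: the name, arguments and archive layout are assumptions.
    """
    stock_tgz, conf_dir, out_tgz = Path(stock_tgz), Path(conf_dir), Path(out_tgz)
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        # Unpack the stock distribution (assumed to be a trusted archive).
        with tarfile.open(stock_tgz, "r:gz") as tar:
            tar.extractall(tmp)
        # Assumption: exactly one top-level directory in the archive;
        # this unpacking raises ValueError otherwise.
        (root,) = [p for p in tmp.iterdir() if p.is_dir()]
        target = root / "conf"
        target.mkdir(exist_ok=True)
        # Copy local settings, e.g. spark-defaults.conf and spark-env.sh.
        for f in conf_dir.iterdir():
            if f.is_file():
                shutil.copy2(f, target / f.name)
        # Re-create the archive under the same top-level directory name.
        with tarfile.open(out_tgz, "w:gz") as tar:
            tar.add(root, arcname=root.name)
    return out_tgz
```

The resulting archive would then be the one the URI setting in spark-defaults points at (presumably `spark.executor.uri`), so each app still needs its own tgz; the script only automates the repackaging step.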
