Andrew - I think your JIRA may duplicate existing work: https://github.com/apache/spark/pull/1513
On Thu, Aug 7, 2014 at 7:55 PM, Andrew Or <and...@databricks.com> wrote:

@Cody I took a quick glance at the Mesos code, and it appears that we currently do not pass extra Java options to executors at all except in coarse-grained mode, and even in that mode we do not pass them correctly. I have filed a related JIRA here: https://issues.apache.org/jira/browse/SPARK-2921. This is a somewhat serious limitation, and we will try to fix it for 1.1.

-Andrew

2014-08-07 19:42 GMT-07:00 Andrew Or <and...@databricks.com>:

Thanks Marcelo, I have moved the changes to a new PR to describe the problems more clearly: https://github.com/apache/spark/pull/1845

@Gary Yeah, the goal is to get this into 1.1 as a bug fix.

2014-08-07 17:30 GMT-07:00 Gary Malouf <malouf.g...@gmail.com>:

Can this be cherry-picked into 1.1 if everything works out? In my opinion, it qualifies as a bug fix.

On Thu, Aug 7, 2014 at 5:47 PM, Marcelo Vanzin <van...@cloudera.com> wrote:

Andrew has been working on a fix: https://github.com/apache/spark/pull/1770

--
Marcelo

On Thu, Aug 7, 2014 at 2:35 PM, Cody Koeninger <c...@koeninger.org> wrote:

Just wanted to check in on this, to see if I should file a bug report regarding the Mesos argument propagation.

On Thu, Jul 31, 2014 at 8:35 AM, Cody Koeninger <c...@koeninger.org> wrote:

1. I've tried both with and without escaping the equals sign; it doesn't affect the results.

2. Yeah, exporting SPARK_SUBMIT_OPTS from spark-env.sh works for getting system properties set in the local shell (although not for executors; see the sketch below).

3. We're using the default fine-grained Mesos mode, not setting spark.mesos.coarse, so it doesn't seem immediately related to that ticket. Should I file a bug report?
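For concreteness, a minimal sketch of the spark-env.sh approach from point 2 above (the property name foo.bar.baz follows the examples elsewhere in this thread; the expected REPL output assumes the behavior Cody reports):

$ cat conf/spark-env.sh
# Picked up by bin/spark-class when launching the SparkSubmit JVM, i.e. the
# driver in client mode; executors do NOT inherit these options.
export SPARK_SUBMIT_OPTS="-Dfoo.bar.baz=23"

$ ./bin/spark-shell

scala> System.getProperty("foo.bar.baz")
res0: String = 23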
On Thu, Jul 31, 2014 at 1:33 AM, Patrick Wendell <pwend...@gmail.com> wrote:

The third issue may be related to this: https://issues.apache.org/jira/browse/SPARK-2022

We can take a look at this during the bug-fix period for the 1.1 release next week. If we come up with a fix, we can backport it into the 1.0 branch as well.

On Wed, Jul 30, 2014 at 11:31 PM, Patrick Wendell <pwend...@gmail.com> wrote:

Thanks for digging around here. I think there are a few distinct issues:

1. Properties containing the '=' character need to be escaped. I was able to load properties fine as long as I escaped the '=' character, but maybe we should document this:

== spark-defaults.conf ==
spark.foo a\=B

== shell ==
scala> sc.getConf.get("spark.foo")
res2: String = a=B

2. spark.driver.extraJavaOptions, when set in the properties file, does not affect the driver when running in client mode (always the case for Mesos). We should probably document this too. In this case you need to either use --driver-java-options or set SPARK_SUBMIT_OPTS.

3. Arguments aren't propagated on Mesos (this might be because of the other issues, or a separate bug).

- Patrick
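Combining Patrick's points 1 and 2 above, a sketch of the expected behavior with the escaped form (untested here; it follows his spark.foo example and the client-mode caveat): the value survives into the SparkConf, but never reaches the driver JVM itself.

== spark-defaults.conf ==
spark.driver.extraJavaOptions -Dfoo.bar.baz\=23

== shell ==
scala> sc.getConf.get("spark.driver.extraJavaOptions")
res0: String = -Dfoo.bar.baz=23

scala> System.getProperty("foo.bar.baz")
res1: String = null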
On Wed, Jul 30, 2014 at 3:10 PM, Cody Koeninger <c...@koeninger.org> wrote:

In addition, spark.executor.extraJavaOptions does not behave as I would expect; Java arguments don't seem to be propagated to executors.

$ cat conf/spark-defaults.conf
spark.master mesos://zk://etl-01.mxstg:2181,etl-02.mxstg:2181,etl-03.mxstg:2181/masters
spark.executor.extraJavaOptions -Dfoo.bar.baz=23
spark.driver.extraJavaOptions -Dfoo.bar.baz=23

$ ./bin/spark-shell

scala> sc.getConf.get("spark.executor.extraJavaOptions")
res0: String = -Dfoo.bar.baz=23

scala> sc.parallelize(1 to 100).map{ i => (
     | java.net.InetAddress.getLocalHost.getHostName,
     | System.getProperty("foo.bar.baz")
     | )}.collect

res1: Array[(String, String)] = Array((dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-02.mxstg,null), (dn-02.mxstg,null), ...

Note that this is a Mesos deployment, although I wouldn't expect that to affect the availability of spark.driver.extraJavaOptions in a local spark shell.

On Wed, Jul 30, 2014 at 4:18 PM, Cody Koeninger <c...@koeninger.org> wrote:

Either whitespace or an equals sign is a valid key/value separator in a properties file. Here's an example:

$ cat conf/spark-defaults.conf
spark.driver.extraJavaOptions -Dfoo.bar.baz=23

$ ./bin/spark-shell -v
Using properties file: /opt/spark/conf/spark-defaults.conf
Adding default property: spark.driver.extraJavaOptions=-Dfoo.bar.baz=23

scala> System.getProperty("foo.bar.baz")
res0: String = null

If you add double quotes, the resulting string value will include the double quotes:

$ cat conf/spark-defaults.conf
spark.driver.extraJavaOptions "-Dfoo.bar.baz=23"

$ ./bin/spark-shell -v
Using properties file: /opt/spark/conf/spark-defaults.conf
Adding default property: spark.driver.extraJavaOptions="-Dfoo.bar.baz=23"

scala> System.getProperty("foo.bar.baz")
res0: String = null

Neither of those affects the issue; the underlying problem in my case seems to be that bin/spark-class uses the SPARK_SUBMIT_OPTS and SPARK_JAVA_OPTS environment variables, but nothing parses spark-defaults.conf before the java process is started.

Here's an example of the process running when only spark-defaults.conf is being used:

$ ps -ef | grep spark

514  5182  2058  0 21:05 pts/2  00:00:00 bash ./bin/spark-shell -v

514  5189  5182  4 21:05 pts/2  00:00:22 /usr/local/java/bin/java -cp ::/opt/spark/conf:/opt/spark/lib/spark-assembly-1.0.1-hadoop2.3.0-mr1-cdh5.0.2.jar:/etc/hadoop/conf-mx -XX:MaxPermSize=128m -Djava.library.path= -Xms512m -Xmx512m org.apache.spark.deploy.SparkSubmit spark-shell -v --class org.apache.spark.repl.Main

Here's an example when the command-line --driver-java-options is used (and thus things work):

$ ps -ef | grep spark

514  5392  2058  0 21:15 pts/2  00:00:00 bash ./bin/spark-shell -v --driver-java-options -Dfoo.bar.baz=23

514  5399  5392 80 21:15 pts/2  00:00:06 /usr/local/java/bin/java -cp ::/opt/spark/conf:/opt/spark/lib/spark-assembly-1.0.1-hadoop2.3.0-mr1-cdh5.0.2.jar:/etc/hadoop/conf-mx -XX:MaxPermSize=128m -Dfoo.bar.baz=23 -Djava.library.path= -Xms512m -Xmx512m org.apache.spark.deploy.SparkSubmit spark-shell -v --driver-java-options -Dfoo.bar.baz=23 --class org.apache.spark.repl.Main
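Given Cody's observation that bin/spark-class reads SPARK_SUBMIT_OPTS but nothing reads spark-defaults.conf before the JVM starts, a one-off alternative to editing spark-env.sh is to set the variable inline (a sketch; the expected output follows from the spark-env.sh behavior shown earlier in the thread):

$ SPARK_SUBMIT_OPTS="-Dfoo.bar.baz=23" ./bin/spark-shell

scala> System.getProperty("foo.bar.baz")
res0: String = 23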
On Wed, Jul 30, 2014 at 3:43 PM, Patrick Wendell <pwend...@gmail.com> wrote:

Cody - in your example you are using the '=' character, but in our documentation and tests we use whitespace to separate the key and value in the defaults file.

docs: http://spark.apache.org/docs/latest/configuration.html

spark.driver.extraJavaOptions -Dfoo.bar.baz=23

I'm not sure whether the Java properties-file parser will try to interpret the equals sign. If so, you might need to do this:

spark.driver.extraJavaOptions "-Dfoo.bar.baz=23"

Do those work for you?

On Wed, Jul 30, 2014 at 1:32 PM, Marcelo Vanzin <van...@cloudera.com> wrote:

Hi Cody,

Could you file a bug for this if there isn't one already?

For system properties, SparkSubmit should be able to read those settings and do the right thing, but that obviously won't work for other JVM options... The current code should work fine in cluster mode, though, since the driver is a different process. :-)

--
Marcelo

On Wed, Jul 30, 2014 at 1:12 PM, Cody Koeninger <c...@koeninger.org> wrote:

We were previously using SPARK_JAVA_OPTS to set Java system properties via -D.

This was used for properties that varied on a per-deployment-environment basis but needed to be available in the Spark shell and workers.

On upgrading to 1.0, we saw that SPARK_JAVA_OPTS had been deprecated and replaced by spark-defaults.conf and command-line arguments to spark-submit or spark-shell.

However, setting spark.driver.extraJavaOptions and spark.executor.extraJavaOptions in spark-defaults.conf is not a replacement for SPARK_JAVA_OPTS:

$ cat conf/spark-defaults.conf
spark.driver.extraJavaOptions=-Dfoo.bar.baz=23

$ ./bin/spark-shell

scala> System.getProperty("foo.bar.baz")
res0: String = null

$ ./bin/spark-shell --driver-java-options "-Dfoo.bar.baz=23"

scala> System.getProperty("foo.bar.baz")
res0: String = 23

Looking through the shell scripts for spark-submit and spark-class, I can see why this is; parsing spark-defaults.conf from bash could be brittle.

But from an ergonomic point of view, it's a step back to go from a set-it-and-forget-it configuration in spark-env.sh to requiring command-line arguments.

I can solve this with an ad-hoc script to wrap spark-shell with the appropriate arguments (see the sketch below), but I wanted to bring the issue up to see if anyone else had run into it, or had any direction for a general solution (beyond parsing Java properties files from bash).
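For reference, the ad-hoc wrapper Cody mentions can be as small as the sketch below (the install path, script name, and option value are illustrative):

$ cat /usr/local/bin/spark-shell-wrapped
#!/usr/bin/env bash
# Pass the per-environment system properties that used to live in
# SPARK_JAVA_OPTS as an explicit driver option; forward all other
# arguments to spark-shell unchanged.
exec /opt/spark/bin/spark-shell --driver-java-options "-Dfoo.bar.baz=23" "$@"

Note this only covers the driver side; per the rest of this thread, getting the same properties onto executors is a separate problem.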