Ok great, I’ll give that a shot - thanks for all the help
> On Apr 14, 2018, at 12:08 PM, Gene Pang <gene.p...@gmail.com> wrote:
>
> Yes, I think that is the case. I haven't tried that before, but it should work.
>
> Thanks,
> Gene
>
> On Fri, Apr 13, 2018 at 11:32 AM, Jason Boorn <jbo...@gmail.com> wrote:
> Hi Gene -
>
> Are you saying that I just need to figure out how to get the Alluxio jar into the classpath of my parent application? If it shows up in the classpath, then Spark will automatically know that it needs to use it when communicating with Alluxio?
>
> Apologies for going back and forth on this - I feel like my particular use case is clouding what is already a tricky issue.
>
>> On Apr 13, 2018, at 2:26 PM, Gene Pang <gene.p...@gmail.com> wrote:
>>
>> Hi Jason,
>>
>> Alluxio does work with Spark in master=local mode. This is because both spark-submit and spark-shell have command-line options to set the classpath for the JVM that is being started.
>>
>> If you are not using spark-submit or spark-shell, you will have to figure out how to configure that JVM instance with the proper properties.
>>
>> Thanks,
>> Gene
>>
>> On Fri, Apr 13, 2018 at 10:47 AM, Jason Boorn <jbo...@gmail.com> wrote:
>> Ok thanks - I was basing my design on this:
>>
>> https://databricks.com/blog/2016/08/15/how-to-use-sparksession-in-apache-spark-2-0.html
>>
>> Wherein it says:
>> Once the SparkSession is instantiated, you can configure Spark’s runtime config properties.
>> Apparently the suite of runtime configs you can change does not include the classpath.
>>
>> So the answer to my original question is basically this:
>>
>> When using local (pseudo-cluster) mode, there is no way to add external jars to the Spark instance. This means that Alluxio will not work with Spark when Spark is run in master=local mode.
>>
>> Thanks again - often getting a definitive “no” is almost as good as a yes. Almost ;)
>>
>>> On Apr 13, 2018, at 1:21 PM, Marcelo Vanzin <van...@cloudera.com> wrote:
>>>
>>> There are two things you're doing wrong here:
>>>
>>> On Thu, Apr 12, 2018 at 6:32 PM, jb44 <jbo...@gmail.com> wrote:
>>>> Then I can add the alluxio client library like so:
>>>> sparkSession.conf.set("spark.driver.extraClassPath", ALLUXIO_SPARK_CLIENT)
>>>
>>> First one: you can't modify JVM configuration after the JVM has already started, so this line does nothing - Spark can't re-launch your application with a new JVM.
>>>
>>>> sparkSession.conf.set("spark.executor.extraClassPath", ALLUXIO_SPARK_CLIENT)
>>>
>>> There is a lot of configuration that you cannot set after the application has already started. For example, after the session is created, this option will most probably be ignored, since executors will already have started.
>>>
>>> I'm not sure what happens when you use dynamic allocation, but in general these post-hoc config changes are not expected to take effect.
>>>
>>> The documentation could be clearer about this (especially the settings that only apply to spark-submit), but that's the gist of it.
>>>
>>> --
>>> Marcelo
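
(For anyone finding this thread in the archives, here is a rough sketch of the two routes discussed above. The jar path, Alluxio master address, and input path are placeholders, not details from Jason's actual setup.)

  // Route 1: launching through spark-submit or spark-shell - put the Alluxio
  // client jar on the classpath before the JVM starts, e.g.:
  //   spark-submit \
  //     --driver-class-path /path/to/alluxio-client.jar \
  //     --conf spark.executor.extraClassPath=/path/to/alluxio-client.jar \
  //     ...
  //
  // Route 2: embedding Spark with master=local in a parent application - the
  // driver is the parent JVM, so the jar has to be on that JVM's classpath at
  // launch (java -cp app.jar:/path/to/alluxio-client.jar ... or a build
  // dependency). Once it is, alluxio:// paths can be read directly; no
  // extraClassPath settings are needed afterwards.

  import org.apache.spark.sql.SparkSession

  object AlluxioLocalSketch {
    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder()
        .master("local[*]")
        .appName("alluxio-local-sketch")
        .getOrCreate()

      // Works only because the Alluxio client jar was already on the JVM
      // classpath when this application started; setting
      // spark.driver.extraClassPath here, after the JVM is up, has no effect.
      val lines = spark.read.textFile("alluxio://localhost:19998/path/to/input")
      println(lines.count())

      spark.stop()
    }
  }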