Ok great, I’ll give that a shot -

Thanks for all the help

> On Apr 14, 2018, at 12:08 PM, Gene Pang <gene.p...@gmail.com> wrote:
> 
> Yes, I think that is the case. I haven't tried that before, but it should 
> work.
> 
> Thanks,
> Gene
> 
> On Fri, Apr 13, 2018 at 11:32 AM, Jason Boorn <jbo...@gmail.com> wrote:
> Hi Gene - 
> 
> Are you saying that I just need to figure out how to get the Alluxio jar into 
> the classpath of my parent application?  If it shows up in the classpath, then 
> Spark will automatically know to use it when communicating with Alluxio?
> 
> Apologies for going back-and-forth on this - I feel like my particular use 
> case is clouding what is already a tricky issue.
> 
>> On Apr 13, 2018, at 2:26 PM, Gene Pang <gene.p...@gmail.com> wrote:
>> 
>> Hi Jason,
>> 
>> Alluxio does work with Spark in master=local mode. This is because both 
>> spark-submit and spark-shell have command-line options to set the classpath 
>> for the JVM that is being started.
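>> 
>> For example (just a sketch; the jar path below is a placeholder for wherever 
>> your Alluxio client jar actually lives):
>> 
>>   spark-shell \
>>     --driver-class-path /path/to/alluxio-client.jar \
>>     --conf spark.executor.extraClassPath=/path/to/alluxio-client.jar
>> 
>> spark-submit accepts the same options.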
>> 
>> If you are not using spark-submit or spark-shell, you will have to figure 
>> out how to configure that JVM instance with the proper properties.
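>> 
>> As a rough sketch (the main class and paths are made up for illustration), 
>> that usually means putting the client jar on the classpath when you launch 
>> the JVM yourself:
>> 
>>   java -cp myapp.jar:/path/to/alluxio-client.jar com.example.MyApp
>> 
>> or declaring the Alluxio client as a dependency in your build so it ends up 
>> on the application classpath.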
>> 
>> Thanks,
>> Gene
>> 
>> On Fri, Apr 13, 2018 at 10:47 AM, Jason Boorn <jbo...@gmail.com> wrote:
>> Ok thanks - I was basing my design on this:
>> 
>> https://databricks.com/blog/2016/08/15/how-to-use-sparksession-in-apache-spark-2-0.html
>> 
>> Wherein it says: "Once the SparkSession is instantiated, you can configure 
>> Spark’s runtime config properties."
>> 
>> Apparently the suite of runtime configs you can change does not include the 
>> classpath.
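>> 
>> For what it's worth, here's a quick sketch of the distinction (the value is 
>> arbitrary): a true runtime config such as
>> 
>>   sparkSession.conf.set("spark.sql.shuffle.partitions", "200")
>> 
>> does take effect on a live session, while the extraClassPath settings are 
>> static and only read when the JVMs launch.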
>> 
>> So the answer to my original question is basically this:
>> 
>> When using local (pseudo-cluster) mode, there is no way to add external jars 
>> to the Spark instance.  This means that Alluxio will not work with Spark 
>> when Spark is run in master=local mode.
>> 
>> Thanks again - often getting a definitive “no” is almost as good as a yes.  
>> Almost ;)
>> 
>>> On Apr 13, 2018, at 1:21 PM, Marcelo Vanzin <van...@cloudera.com> wrote:
>>> 
>>> There are two things you're doing wrong here:
>>> 
>>> On Thu, Apr 12, 2018 at 6:32 PM, jb44 <jbo...@gmail.com> wrote:
>>>> Then I can add the alluxio client library like so:
>>>> sparkSession.conf.set("spark.driver.extraClassPath", ALLUXIO_SPARK_CLIENT)
>>> 
>>> First: you can't modify the JVM's configuration after it has already
>>> started, so this line does nothing. Spark can't re-launch your
>>> application with a new JVM.
>>> 
>>>> sparkSession.conf.set("spark.executor.extraClassPath", 
>>>> ALLUXIO_SPARK_CLIENT)
>>> 
>>> Second: there is a lot of configuration that you cannot set after the
>>> application has already started. Once the session is created, this
>>> option will most likely be ignored, since the executors will already
>>> have started.
>>> 
>>> I'm not sure what happens when you use dynamic allocation, but in
>>> general these post-hoc config changes are not expected to take effect.
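>>> 
>>> To make the ordering concrete, here is a minimal sketch (assuming
>>> ALLUXIO_SPARK_CLIENT holds the jar path, as in your snippet). Settings
>>> like these have to go on the builder, before the session exists,
>>> rather than being set on the session afterwards:
>>> 
>>>   import org.apache.spark.sql.SparkSession
>>> 
>>>   val spark = SparkSession.builder()
>>>     .appName("alluxio-example")
>>>     // the executors have not launched yet, so this can still apply;
>>>     // the driver's own classpath must be set when its JVM starts,
>>>     // e.g. via spark-submit --driver-class-path
>>>     .config("spark.executor.extraClassPath", ALLUXIO_SPARK_CLIENT)
>>>     .getOrCreate()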
>>> 
>>> The documentation could be clearer about this (especially stuff that
>>> only applies to spark-submit), but that's the gist of it.
>>> 
>>> 
>>> -- 
>>> Marcelo