Hi,

I am creating a SparkContext against a Spark standalone cluster, as described here:
http://spark.apache.org/docs/latest/spark-standalone.html, using the
following code:

--------------------------------------------------------------------------------------------------------------------------
import multiprocessing

from pyspark import SparkConf, SparkContext

sc.stop()  # stop the context created by the pyspark shell before building a new one
conf = SparkConf().set('spark.driver.allowMultipleContexts', 'false') \
                  .setMaster("spark://hostname:7077") \
                  .set('spark.shuffle.service.enabled', 'true') \
                  .set('spark.dynamicAllocation.enabled', 'true') \
                  .set('spark.executor.memory', '20g') \
                  .set('spark.driver.memory', '4g') \
                  .set('spark.default.parallelism', str(multiprocessing.cpu_count() - 1))
conf.getAll()
sc = SparkContext(conf=conf)

----- (we should definitely be able to optimise the configuration, but that
is not the point here) -----

Using this method, I am not able to load packages (a list of which can be
found at http://spark-packages.org).

Whereas if I use the standard "pyspark --packages" option, the packages
load just fine.

I would be grateful if someone could kindly let me know how to load packages
when creating the SparkContext against the standalone cluster as above.
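
For what it is worth, one direction I am considering (but have not verified)
is to set spark.jars.packages in the SparkConf, or alternatively to put a
--packages argument into the PYSPARK_SUBMIT_ARGS environment variable before
the first context is created in the Python process. The Maven coordinate
below is only a placeholder; this is just a sketch, not something I know to
work against a standalone cluster:

--------------------------------------------------------------------------------------------------------------------------
import os

from pyspark import SparkConf, SparkContext

# Placeholder Maven coordinate (group:artifact:version); substitute the
# actual package needed from http://spark-packages.org.
packages = 'graphframes:graphframes:0.6.0-spark2.3-s_2.11'

# Option 1: let Spark resolve the Maven coordinates via spark.jars.packages.
conf = SparkConf().setMaster('spark://hostname:7077') \
                  .set('spark.jars.packages', packages)

# Option 2 (alternative): pass --packages to the launcher; this has to be
# set before the very first SparkContext is created in this Python process.
# os.environ['PYSPARK_SUBMIT_ARGS'] = '--packages {} pyspark-shell'.format(packages)

sc = SparkContext(conf=conf)
--------------------------------------------------------------------------------------------------------------------------

Would something along these lines be the right approach?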


Regards,
Gourav Sengupta
