Spark master is set to `local[*]` by default. Here is the corresponding piece from interpreter-settings.json for the Spark interpreter:

  "master": {
    "envName": "MASTER",
    "propertyName": "spark.master",
    "defaultValue": "local[*]",
    "description": "Spark master uri. local | yarn-client | yarn-cluster | spark master address of standalone mode, ex) spark://master_host:7077",
    "type": "string"
  },
Patrik Iselind at "Sun, 10 May 2020 20:31:08 +0200" wrote:

PI> Hi Jeff,
PI>
PI> I've tried the release from http://zeppelin.apache.org/download.html, both in a docker and without a docker. They both have the same issue as
PI> previously described.
PI>
PI> Can I somehow set spark.master to "local[*]" in zeppelin, perhaps using some environment variable?
PI>
PI> When is the next Zeppelin 0.9.0 docker image planned to be released?
PI>
PI> Best Regards,
PI> Patrik Iselind
PI>
PI> On Sun, May 10, 2020 at 9:26 AM Jeff Zhang <zjf...@gmail.com> wrote:
PI>
PI> Hi Patrik,
PI>
PI> Do you mind trying the 0.9.0-preview? It might be an issue with the docker container.
PI>
PI> http://zeppelin.apache.org/download.html
PI>
PI> Patrik Iselind <patrik....@gmail.com> wrote on Sun, May 10, 2020 at 2:30 AM:
PI>
PI> Hello Jeff,
PI>
PI> Thank you for looking into this for me.
PI>
PI> Using the latest pushed docker image for 0.9.0 (image ID 92890adfadfb, built 6 weeks ago), I still see the same issue. My image has
PI> the digest "apache/zeppelin@sha256:0691909f6884319d366f5d3a5add8802738d6240a83b2e53e980caeb6c658092".
PI>
PI> If it's not on the tip of master, could you guys please release a newer 0.9.0 image?
PI>
PI> Best Regards,
PI> Patrik Iselind
PI>
PI> On Sat, May 9, 2020 at 4:03 PM Jeff Zhang <zjf...@gmail.com> wrote:
PI>
PI> This might be a bug in 0.8; I tried it in 0.9 (master branch), and it works for me.
PI>
PI> print(sc.master)
PI> print(sc.defaultParallelism)
PI>
PI> ---
PI> local[*]
PI> 8
PI>
PI> Patrik Iselind <patrik....@gmail.com> wrote on Sat, May 9, 2020 at 8:34 PM:
PI>
PI> Hi,
PI>
PI> First comes some background, then I have some questions.
PI>
PI> Background
PI> I'm trying out Zeppelin 0.8.2 based on the Docker image. My Docker file looks like this:
PI>
PI> ```Dockerfile
PI> FROM apache/zeppelin:0.8.2
PI>
PI> # Install Java and some tools
PI> RUN apt-get -y update &&\
PI>     DEBIAN_FRONTEND=noninteractive \
PI>     apt -y install vim python3-pip
PI>
PI> RUN python3 -m pip install -U pyspark
PI>
PI> ENV PYSPARK_PYTHON python3
PI> ENV PYSPARK_DRIVER_PYTHON python3
PI> ```
PI>
PI> When I start a section like so
PI>
PI> ```Zeppelin paragraph
PI> %pyspark
PI>
PI> print(sc)
PI> print()
PI> print(dir(sc))
PI> print()
PI> print(sc.master)
PI> print()
PI> print(sc.defaultParallelism)
PI> ```
PI>
PI> I get the following output
PI>
PI> ```output
PI> <SparkContext master=local appName=Zeppelin>
PI>
PI> ['PACKAGE_EXTENSIONS', '__class__', '__delattr__', '__dict__', '__dir__',
PI> '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattribute__', '__getnewargs__', '__gt__',
PI> '__hash__', '__init__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
PI> '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_accumulatorServer', '_active_spark_context',
PI> '_batchSize', '_callsite', '_checkpointFile', '_conf', '_dictToJavaMap', '_do_init', '_ensure_initialized', '_gateway',
PI> '_getJavaStorageLevel', '_initialize_context', '_javaAccumulator', '_jsc', '_jvm', '_lock', '_next_accum_id',
PI> '_pickled_broadcast_vars', '_python_includes', '_repr_html_', '_temp_dir', '_unbatched_serializer', 'accumulator', 'addFile',
PI> 'addPyFile', 'appName', 'applicationId', 'binaryFiles', 'binaryRecords', 'broadcast', 'cancelAllJobs', 'cancelJobGroup',
PI> 'defaultMinPartitions', 'defaultParallelism', 'dump_profiles', 'emptyRDD', 'environment', 'getConf', 'getLocalProperty',
PI> 'getOrCreate', 'hadoopFile', 'hadoopRDD', 'master', 'newAPIHadoopFile', 'newAPIHadoopRDD', 'parallelize', 'pickleFile',
PI> 'profiler_collector', 'pythonExec', 'pythonVer', 'range', 'runJob', 'sequenceFile', 'serializer', 'setCheckpointDir',
PI> 'setJobGroup', 'setLocalProperty', 'setLogLevel', 'setSystemProperty', 'show_profiles', 'sparkHome', 'sparkUser', 'startTime',
PI> 'statusTracker', 'stop', 'textFile', 'uiWebUrl', 'union', 'version', 'wholeTextFiles']
PI>
PI> local
PI>
PI> 1
PI> ```
PI>
PI> This is even though the "master" property in the interpreter is set to "local[*]". I'd like to use all cores on my machine. To
PI> do that I have to explicitly create the "spark.master" property in the Spark interpreter with the value "local[*]"; then I
PI> get
PI>
PI> ```new output
PI> <SparkContext master=local[*] appName=Zeppelin>
PI>
PI> ['PACKAGE_EXTENSIONS', '__class__', '__delattr__', '__dict__', '__dir__',
PI> '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattribute__', '__getnewargs__', '__gt__',
PI> '__hash__', '__init__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
PI> '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_accumulatorServer', '_active_spark_context',
PI> '_batchSize', '_callsite', '_checkpointFile', '_conf', '_dictToJavaMap', '_do_init', '_ensure_initialized', '_gateway',
PI> '_getJavaStorageLevel', '_initialize_context', '_javaAccumulator', '_jsc', '_jvm', '_lock', '_next_accum_id',
PI> '_pickled_broadcast_vars', '_python_includes', '_repr_html_', '_temp_dir', '_unbatched_serializer', 'accumulator', 'addFile',
PI> 'addPyFile', 'appName', 'applicationId', 'binaryFiles', 'binaryRecords', 'broadcast', 'cancelAllJobs', 'cancelJobGroup',
PI> 'defaultMinPartitions', 'defaultParallelism', 'dump_profiles', 'emptyRDD', 'environment', 'getConf', 'getLocalProperty',
PI> 'getOrCreate', 'hadoopFile', 'hadoopRDD', 'master', 'newAPIHadoopFile', 'newAPIHadoopRDD', 'parallelize', 'pickleFile',
PI> 'profiler_collector', 'pythonExec', 'pythonVer', 'range', 'runJob', 'sequenceFile', 'serializer', 'setCheckpointDir',
PI> 'setJobGroup', 'setLocalProperty', 'setLogLevel', 'setSystemProperty', 'show_profiles', 'sparkHome', 'sparkUser', 'startTime',
PI> 'statusTracker', 'stop', 'textFile', 'uiWebUrl', 'union', 'version', 'wholeTextFiles']
PI>
PI> local[*]
PI>
PI> 8
PI> ```
PI>
PI> This is what I want.
PI>
PI> The Questions
PI> 1. Why is the "master" property not used in the created SparkContext?
PI> 2. How do I add the spark.master property to the docker image?
PI>
PI> Any hint or support you can provide would be greatly appreciated.
PI>
PI> Yours Sincerely,
PI> Patrik Iselind
PI>
PI> --
PI> Best Regards
PI>
PI> Jeff Zhang
PI>
PI> --
PI> Best Regards
PI>
PI> Jeff Zhang

--
With best wishes,                    Alex Ott
http://alexott.net/
Twitter: alexott_en (English), alexott (Russian)