Hello Jeff,

Thank you for looking into this for me.
Using the latest pushed Docker image for 0.9.0 (image ID 92890adfadfb, built 6 weeks ago), I still see the same issue. My image has the digest "apache/zeppelin@sha256:0691909f6884319d366f5d3a5add8802738d6240a83b2e53e980caeb6c658092". If that image is not built from the tip of master, could you please release a newer 0.9.0 image?
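For reference, a paragraph along these lines should be enough to check which master the interpreter actually created the context with; sc.getConf() is plain PySpark, so nothing Zeppelin-specific is assumed:

```Zeppelin paragraph
%pyspark

# Show which master and parallelism the interpreter-created SparkContext ended up with.
print(sc.master)                         # e.g. "local" or "local[*]"
print(sc.defaultParallelism)             # in local[*] mode this should match the core count
print(sc.getConf().get("spark.master"))  # the value Spark itself was configured with
```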
Best Regards,
Patrik Iselind

On Sat, May 9, 2020 at 4:03 PM Jeff Zhang <zjf...@gmail.com> wrote:

> This might be a bug of 0.8. I tried the same thing in 0.9 (master branch) and it works
> for me:
>
> print(sc.master)
> print(sc.defaultParallelism)
>
> ---
> local[*]
> 8
>
> Patrik Iselind <patrik....@gmail.com> wrote on Sat, May 9, 2020 at 8:34 PM:
>
>> Hi,
>>
>> First comes some background, then I have some questions.
>>
>> *Background*
>> I'm trying out Zeppelin 0.8.2 based on the Docker image. My Dockerfile
>> looks like this:
>>
>> ```Dockerfile
>> FROM apache/zeppelin:0.8.2
>>
>> # Install some extra tools
>> RUN apt-get -y update &&\
>>     DEBIAN_FRONTEND=noninteractive \
>>     apt -y install vim python3-pip
>>
>> RUN python3 -m pip install -U pyspark
>>
>> ENV PYSPARK_PYTHON python3
>> ENV PYSPARK_DRIVER_PYTHON python3
>> ```
>>
>> When I run a paragraph like this
>>
>> ```Zeppelin paragraph
>> %pyspark
>>
>> print(sc)
>> print()
>> print(dir(sc))
>> print()
>> print(sc.master)
>> print()
>> print(sc.defaultParallelism)
>> ```
>>
>> I get the following output
>>
>> ```output
>> <SparkContext master=local appName=Zeppelin>
>> ['PACKAGE_EXTENSIONS', '__class__', '__delattr__', '__dict__', '__dir__',
>> '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__',
>> '__getattribute__', '__getnewargs__', '__gt__', '__hash__', '__init__',
>> '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__',
>> '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__',
>> '__subclasshook__', '__weakref__', '_accumulatorServer',
>> '_active_spark_context', '_batchSize', '_callsite', '_checkpointFile',
>> '_conf', '_dictToJavaMap', '_do_init', '_ensure_initialized', '_gateway',
>> '_getJavaStorageLevel', '_initialize_context', '_javaAccumulator', '_jsc',
>> '_jvm', '_lock', '_next_accum_id', '_pickled_broadcast_vars',
>> '_python_includes', '_repr_html_', '_temp_dir', '_unbatched_serializer',
>> 'accumulator', 'addFile', 'addPyFile', 'appName', 'applicationId',
>> 'binaryFiles', 'binaryRecords', 'broadcast', 'cancelAllJobs',
>> 'cancelJobGroup', 'defaultMinPartitions', 'defaultParallelism',
>> 'dump_profiles', 'emptyRDD', 'environment', 'getConf', 'getLocalProperty',
>> 'getOrCreate', 'hadoopFile', 'hadoopRDD', 'master', 'newAPIHadoopFile',
>> 'newAPIHadoopRDD', 'parallelize', 'pickleFile', 'profiler_collector',
>> 'pythonExec', 'pythonVer', 'range', 'runJob', 'sequenceFile', 'serializer',
>> 'setCheckpointDir', 'setJobGroup', 'setLocalProperty', 'setLogLevel',
>> 'setSystemProperty', 'show_profiles', 'sparkHome', 'sparkUser',
>> 'startTime', 'statusTracker', 'stop', 'textFile', 'uiWebUrl', 'union',
>> 'version', 'wholeTextFiles']
>> local
>> 1
>> ```
>>
>> This happens even though the "master" property in the interpreter is set to
>> "local[*]". I'd like to use all cores on my machine. To do that I have to
>> explicitly create the "spark.master" property in the Spark interpreter
>> with the value "local[*]"; then I get
>>
>> ```new output
>> <SparkContext master=local[*] appName=Zeppelin>
>> ['PACKAGE_EXTENSIONS', '__class__', '__delattr__', '__dict__', '__dir__',
>> '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__',
>> '__getattribute__', '__getnewargs__', '__gt__', '__hash__', '__init__',
>> '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__',
>> '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__',
>> '__subclasshook__', '__weakref__', '_accumulatorServer',
>> '_active_spark_context', '_batchSize', '_callsite', '_checkpointFile',
>> '_conf', '_dictToJavaMap', '_do_init', '_ensure_initialized', '_gateway',
>> '_getJavaStorageLevel', '_initialize_context', '_javaAccumulator', '_jsc',
>> '_jvm', '_lock', '_next_accum_id', '_pickled_broadcast_vars',
>> '_python_includes', '_repr_html_', '_temp_dir', '_unbatched_serializer',
>> 'accumulator', 'addFile', 'addPyFile', 'appName', 'applicationId',
>> 'binaryFiles', 'binaryRecords', 'broadcast', 'cancelAllJobs',
>> 'cancelJobGroup', 'defaultMinPartitions', 'defaultParallelism',
>> 'dump_profiles', 'emptyRDD', 'environment', 'getConf', 'getLocalProperty',
>> 'getOrCreate', 'hadoopFile', 'hadoopRDD', 'master', 'newAPIHadoopFile',
>> 'newAPIHadoopRDD', 'parallelize', 'pickleFile', 'profiler_collector',
>> 'pythonExec', 'pythonVer', 'range', 'runJob', 'sequenceFile', 'serializer',
>> 'setCheckpointDir', 'setJobGroup', 'setLocalProperty', 'setLogLevel',
>> 'setSystemProperty', 'show_profiles', 'sparkHome', 'sparkUser',
>> 'startTime', 'statusTracker', 'stop', 'textFile', 'uiWebUrl', 'union',
>> 'version', 'wholeTextFiles']
>> local[*]
>> 8
>> ```
>>
>> This is what I want.
>>
>> *The Questions*
>>
>> - Why is the "master" property not used in the created SparkContext?
>> - How do I add the spark.master property to the Docker image?
>>
>> Any hint or support you can provide would be greatly appreciated.
>>
>> Yours Sincerely,
>> Patrik Iselind
>>
>
> --
> Best Regards
>
> Jeff Zhang