Hi Jeff, I am using the embedded mode, and there is no SPARK_HOME set for the user running the daemon on the server. On my local computer I am also running the embedded Spark, but I also have a local Spark installation and the SPARK_HOME env var is set. My local installation is correctly set up, so I can read and write files with the wasbs protocol.
I just compared the environments through the Spark UIs, and I noticed that on my local computer some parameters are mixed in from my local Spark installation. For example, all of my local installation's jars are loaded on the classpath. So the embedded Spark can be messed up if the SPARK_HOME env var is set.

It also seems that using Azure Storage with the wasbs protocol does not work out of the box. I was confused by the fact that all the needed jar files are present in the lib directory of the Zeppelin installation folder. Actually, to make Azure Storage work, one must copy the needed jars from the lib directory to the interpreter dep directory:

    zeppelin-0.8.0-bin-all$ cp lib/*azure* interpreter/spark/dep/

and set up the interpreter with the following parameters:

    spark.hadoop.fs.azure  org.apache.hadoop.fs.azure.NativeAzureFileSystem
    spark.hadoop.fs.azure.account.key.<mystorageaccount>.blob.core.windows.net  <mykey>

Metin

1:29 am, Jeff Zhang wrote:
> Do you specify SPARK_HOME, or are you just using the local embedded mode of Spark?
>
> Metin OSMAN <mos...@mixdata.com> wrote on Thu, Oct 4, 2018 at 1:39 AM:
> > Hi,
> >
> > I have downloaded and set up Zeppelin on my local Ubuntu 18.04 computer, and
> > I successfully managed to open files on Azure Storage with the Spark
> > interpreter out of the box.
> > Then I installed the same package on an Ubuntu 14.04 server.
> > When I try running a simple Spark read of a parquet file from an Azure
> > Storage account, I get a java.io.IOException: No FileSystem for scheme: wasbs
> >
> > sqlContext.read.parquet("wasbs://mycontai...@myacountsa.blob.core.windows.net/mypath")
> >
> > java.io.IOException: No FileSystem for scheme: wasbs
> >   at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2304)
> >   at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2311)
> >   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:90)
> >   at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2350)
> >   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2332)
> >   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:369)
> >   at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
> >   at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:350)
> >   at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:348)
> >   at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
> >   at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
> >   at scala.collection.immutable.List.foreach(List.scala:381)
> >   at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
> >   at scala.collection.immutable.List.flatMap(List.scala:344)
> >   at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:348)
> >   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
> >   at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:559)
> >   at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:543)
> >   ... 52 elided
> >
> > I copied the interpreter.json file from my local computer to the server, but
> > that has not changed anything.
> > Should it work out of the box, or could the fact that it worked on my local
> > computer be due to some local Spark configuration or environment variables?
> >
> > Thank you,
> >
> > Metin
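For anyone hitting this later: the "No FileSystem for scheme: wasbs" error means Hadoop's FileSystem.getFileSystemClass could not find an implementation class registered for the wasbs scheme, because the azure jars that provide it were not on the interpreter's classpath. Below is a minimal toy sketch of that lookup mechanism, not Hadoop's actual code; the property keys and class names are illustrative assumptions:

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Toy model (not Hadoop's real implementation) of how a URI scheme is
// resolved to a FileSystem class: look up "fs.<scheme>.impl" in the
// configuration and fail with the exact error seen above when nothing
// is registered for that scheme.
public class SchemeLookup {
    static String fileSystemClass(String scheme, Map<String, String> conf)
            throws IOException {
        String impl = conf.get("fs." + scheme + ".impl");
        if (impl == null) {
            throw new IOException("No FileSystem for scheme: " + scheme);
        }
        return impl;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> conf = new HashMap<>();

        // Without the azure jars and interpreter settings, wasbs is unknown:
        try {
            fileSystemClass("wasbs", conf);
        } catch (IOException e) {
            System.out.println(e.getMessage()); // No FileSystem for scheme: wasbs
        }

        // Once an implementation is registered for the scheme, lookup succeeds
        // (key name is illustrative):
        conf.put("fs.wasbs.impl",
                 "org.apache.hadoop.fs.azure.NativeAzureFileSystem");
        System.out.println(fileSystemClass("wasbs", conf));
    }
}
```

This is also why both steps above seem to be needed: the cp puts the jars containing NativeAzureFileSystem on the interpreter's classpath, and the interpreter properties register that class for the Azure schemes.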