Hi,

Judging from the error

Caused by: java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation

it looks like you need to add the libraries for the TFS FileSystem to the classpath of Zeppelin's Spark interpreter.

Best,
moon

On Thu, Jun 25, 2015 at 3:36 PM Udit Mehta <ume...@groupon.com> wrote:

> Hi,
>
> I am facing some issues while running the Zeppelin Spark interpreter in
> yarn-client mode. I get the following error in
> zeppelin-interpreter-spark*.log:
>
> ERROR [2015-06-25 22:17:56,374] ({pool-1-thread-4} ProcessFunction.java[process]:41) - Internal error processing getProgress
> org.apache.zeppelin.interpreter.InterpreterException: java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
>     at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:75)
>     at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
>     at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getProgress(LazyOpenInterpreter.java:109)
>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.getProgress(RemoteInterpreterServer.java:299)
>     at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$getProgress.getResult(RemoteInterpreterService.java:938)
>     at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$getProgress.getResult(RemoteInterpreterService.java:923)
>     at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>     at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
>
> Caused by: java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
>     at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:216)
>     at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2564)
>     at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2574)
>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
>     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:169)
>     at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:216)
>     at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:384)
>     at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:102)
>     at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:58)
>     at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:141)
>     at org.apache.spark.SparkContext.<init>(SparkContext.scala:381)
>     at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:276)
>     at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:149)
>     at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:398)
>     at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:73)
>     ... 11 more
>
> I'm running the interpreter in yarn-client mode. Here is the command I used
> to build the Zeppelin jar:
>
> mvn clean install -DskipTests -Pspark-1.3 -Dspark.version=1.3.1 -Phadoop-2.6 -Pyarn
>
> Here is my conf/interpreter.json (spark section):
>
>> "spark.yarn.access.namenodes": "hdfs://namenode1.snc1:8032,hdfs://namenode2.snc1:8032",
>> "master": "yarn-client",
>> "zeppelin.spark.maxResult": "1000",
>> "zeppelin.dep.localrepo": "local-repo",
>> "spark.app.name": "Zeppelin-umehta",
>> "spark.yarn.queue": "public",
>> "spark.executor.memory": "512m",
>> "zeppelin.spark.useHiveContext": "true",
>> "zeppelin.spark.concurrentSQL": "false",
>> "spark.home": "/path/to/spark-1.3"
>
> And here's my conf/zeppelin-env:
>
>> export HADOOP_CONF_DIR=/etc/hadoop/conf
>> export ZEPPELIN_PORT=10008
>> export ZEPPELIN_JAVA_OPTS="-Dhdp.version=2.2.0.0-2041"
>> export ZEPPELIN_INTERPRETER_DIR=/home/umehta/zeppelin-0.6.0-incubating-SNAPSHOT/interpreter
>> export ZEPPELIN_INTERPRETERS=org.apache.zeppelin.spark.SparkInterpreter,org.apache.zeppelin.spark.PySparkInterpreter,org.apache.zeppelin.spark.SparkSqlInterpreter,org.apache.zeppelin.spark.DepInterpreter,org.apache.zeppelin.markdown.Markdown,org.apache.zeppelin.shell.ShellInterpreter,org.apache.zeppelin.hive.HiveInterpreter
>
> I'm using HDP 2.2.0.0-2041 on my cluster.
>
> Does anyone have any insights on this?
>
> Thanks in advance,
> Udit
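
In case it helps to narrow down which jar brings in the TFS class: the trace above fails inside FileSystem.loadFileSystems, which, as far as I know, discovers FileSystem implementations through java.util.ServiceLoader, i.e. through META-INF/services/org.apache.hadoop.fs.FileSystem entries in the jars on the classpath. Below is a minimal sketch in Python, not a Zeppelin or Hadoop tool, that lists the jars under a directory that register such implementations. The function names are hypothetical, and the directories to scan (for example the interpreter directory or spark.home from the config above) are an assumption about where the relevant jars live.

```python
import sys
import zipfile
from pathlib import Path

# Service file that java.util.ServiceLoader reads to find Hadoop FileSystem classes.
SERVICE_ENTRY = "META-INF/services/org.apache.hadoop.fs.FileSystem"


def filesystem_providers(jar_path):
    """Return the FileSystem class names a jar registers, or [] if it registers none."""
    try:
        with zipfile.ZipFile(jar_path) as jar:
            try:
                raw = jar.read(SERVICE_ENTRY)
            except KeyError:
                # The jar has no service file for FileSystem.
                return []
    except (zipfile.BadZipFile, OSError):
        # Unreadable or corrupt archive: skip it rather than abort the scan.
        return []
    names = []
    for line in raw.decode("utf-8", errors="replace").splitlines():
        # Service files allow '#' comments and blank lines.
        line = line.split("#", 1)[0].strip()
        if line:
            names.append(line)
    return names


def scan(directory):
    """Map each jar under `directory` (recursively) to the providers it registers."""
    found = {}
    for jar in sorted(Path(directory).rglob("*.jar")):
        providers = filesystem_providers(jar)
        if providers:
            found[str(jar)] = providers
    return found


if __name__ == "__main__":
    # Usage (hypothetical script name): python find_fs_providers.py <dir> [<dir> ...]
    for root in sys.argv[1:]:
        for jar, providers in scan(root).items():
            print(jar)
            for name in providers:
                print("    " + name)
```

Any jar whose listing includes a class named TFS would be the first place to look, since that is the class the exception message names.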