Re: Unable to run spark interpreter on Zeppelin

moon soo Lee Thu, 25 Jun 2015 21:14:52 -0700

Hi,

Judging from the error:


Caused by: java.lang.UnsupportedOperationException: Not implemented by the
TFS FileSystem implementation

It looks like you need to add the libraries for the TFS FileSystem to the
classpath of Zeppelin's Spark interpreter.
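
To see which jars, if any, already register a TFS class, a rough check like
the one below might help. It is only a sketch and rests on one assumption:
that Hadoop discovers FileSystem implementations through
java.util.ServiceLoader, i.e. through a
META-INF/services/org.apache.hadoop.fs.FileSystem entry inside each jar (the
loadFileSystems frame in the trace suggests this). The helper name
find_filesystem_providers is made up for illustration and is not part of
Zeppelin, Spark, or Hadoop.

```python
import zipfile
from pathlib import Path

# Service file that java.util.ServiceLoader would read for Hadoop
# FileSystem providers (assumption stated in the text above).
SERVICE_ENTRY = "META-INF/services/org.apache.hadoop.fs.FileSystem"


def find_filesystem_providers(jar_dir):
    """Map each jar under jar_dir to the FileSystem classes it registers.

    Hypothetical diagnostic helper; jars without the service entry are omitted.
    """
    providers = {}
    for jar in sorted(Path(jar_dir).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if SERVICE_ENTRY not in zf.namelist():
                    continue
                text = zf.read(SERVICE_ENTRY).decode("utf-8", errors="replace")
        except (zipfile.BadZipFile, OSError):
            continue  # skip unreadable or corrupt jars

        # Service files list one class name per line; '#' starts a comment.
        classes = [line.split("#", 1)[0].strip() for line in text.splitlines()]
        classes = [name for name in classes if name]
        if classes:
            providers[str(jar)] = classes
    return providers
```

Calling find_filesystem_providers() on the directory set as
ZEPPELIN_INTERPRETER_DIR, and on the jars under your spark.home, should show
whether a class named TFS is registered and from which jar it comes.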

Best,
moon


On Thu, Jun 25, 2015 at 3:36 PM Udit Mehta <ume...@groupon.com> wrote:

> Hi,
>
> I am having trouble running the Zeppelin Spark interpreter in yarn-client
> mode. I get the following error in
> zeppelin-interpreter-spark*.log:
>
> ERROR [2015-06-25 22:17:56,374] ({pool-1-thread-4}
> ProcessFunction.java[process]:41) - Internal error processing getProgress
> org.apache.zeppelin.interpreter.InterpreterException:
> java.lang.UnsupportedOperationException: Not implemented by the TFS
> FileSystem implementation
>     at
> org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:75)
>     at
> org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
>     at
> org.apache.zeppelin.interpreter.LazyOpenInterpreter.getProgress(LazyOpenInterpreter.java:109)
>     at
> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.getProgress(RemoteInterpreterServer.java:299)
>     at
> org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$getProgress.getResult(RemoteInterpreterService.java:938)
>     at
> org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$getProgress.getResult(RemoteInterpreterService.java:923)
>     at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>     at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>     at
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
>     at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
>
> Caused by: java.lang.UnsupportedOperationException: Not implemented by the
> TFS FileSystem implementation
>     at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:216)
>     at
> org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2564)
>     at
> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2574)
>     at
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
>     at
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:169)
>     at
> org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:216)
>     at
> org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:384)
>     at
> org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:102)
>     at
> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:58)
>     at
> org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:141)
>     at org.apache.spark.SparkContext.<init>(SparkContext.scala:381)
>     at
> org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:276)
>     at
> org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:149)
>     at
> org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:398)
>     at
> org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:73)
>     ... 11 more
>
>
> I'm running the interpreter in yarn-client mode. Here is the command I used
> to build the Zeppelin jar:
>
> mvn clean install -DskipTests -Pspark-1.3 -Dspark.version=1.3.1
> -Phadoop-2.6 -Pyarn
>
> Here is my conf/interpreter.json (spark section):
>
>>         "spark.yarn.access.namenodes":
>> "hdfs://namenode1.snc1:8032,hdfs://namenode2.snc1:8032",
>>         "master": "yarn-client",
>>         "zeppelin.spark.maxResult": "1000",
>>         "zeppelin.dep.localrepo": "local-repo",
>>         "spark.app.name": "Zeppelin-umehta",
>>         "spark.yarn.queue": "public",
>>         "spark.executor.memory": "512m",
>>         "zeppelin.spark.useHiveContext": "true",
>>         "zeppelin.spark.concurrentSQL": "false",
>>         "spark.home": "/path/to/spark-1.3"
>>
>
> And here is my conf/zeppelin-env:
>
>> export HADOOP_CONF_DIR=/etc/hadoop/conf
>> export ZEPPELIN_PORT=10008
>> export ZEPPELIN_JAVA_OPTS="-Dhdp.version=2.2.0.0-2041"
>> export ZEPPELIN_INTERPRETER_DIR=/home/umehta/zeppelin-0.6.0-incubating-SNAPSHOT/interpreter
>> export ZEPPELIN_INTERPRETERS=org.apache.zeppelin.spark.SparkInterpreter,org.apache.zeppelin.spark.PySparkInterpreter,org.apache.zeppelin.spark.SparkSqlInterpreter,org.apache.zeppelin.spark.DepInterpreter,org.apache.zeppelin.markdown.Markdown,org.apache.zeppelin.shell.ShellInterpreter,org.apache.zeppelin.hive.HiveInterpreter
>>
>>
>
> I'm using HDP 2.2.0.0-2041 on my cluster.
>
> Does anyone have any insights on this?
>
> Thanks in advance,
> Udit
>
