Your Zeppelin isn't finding the Spark cluster. Did you try re-binding the interpreter? Also, I don't know your Cloudera setup, but I believe that usually comes pre-baked with YARN. Per those instructions, did you try setting the master to 'yarn-client'?
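
If it helps, here's roughly what I'd try in conf/zeppelin-env.sh on the Zeppelin node -- the SPARK_HOME is the one from your mail; the HADOOP_CONF_DIR path is just my guess at the usual CDH location, so adjust it if yours differs:

  export MASTER=yarn-client
  export SPARK_HOME=/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/spark
  export HADOOP_CONF_DIR=/etc/hadoop/conf   # assumed standard CDH path

Then restart the Zeppelin daemon and re-bind the spark interpreter in your notebook (or set master to yarn-client on the interpreter settings page and restart the interpreter there).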
Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things." -Virgil*

On Tue, Dec 8, 2015 at 7:09 PM, Hoc Phan <quang...@yahoo.com> wrote:

> Hi all
>
> I am using Cloudera 5.5 Express with Spark 1.5 installed across the
> cluster. I have tested PySpark on the command line and it works, so my
> cluster is fine. However, when I use Zeppelin with the Spark cluster, I get
> the error below just doing something simple like:
>
> %pyspark
> print "abcd"
>
> *Error:*
>
> org.apache.thrift.transport.TTransportException
>     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
>     at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
>     at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
>     at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
>     at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
>     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
>     at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_interpret(RemoteInterpreterService.java:220)
>     at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.interpret(RemoteInterpreterService.java:205)
>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:211)
>     at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
>     at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:207)
>     at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
>     at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:304)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
>
> If I set the master to local[*], it is fine. If I set the master to
> spark://cdhe1master.fbdl.local:7077, it gives the error above.
> I checked my master hostname and port; both are correct and working.
>
> I followed the instructions here:
> https://zeppelin.incubator.apache.org/docs/0.5.5-incubating/interpreter/spark.html
> and have SPARK_HOME=/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/spark
>
> Any idea?