I’m using this .zip https://github.com/apache/incubator-zeppelin
Thanks,
Karthik

From: moon soo Lee [mailto:[email protected]]
Sent: Wednesday, July 8, 2015 1:37 PM
To: [email protected]
Subject: Re: Not able to see registered table records and Pyspark not working

Are you building on latest master?

On Wed, Jul 8, 2015 at 1:34 PM Vadla, Karthik <[email protected]> wrote:

Hi Moon,

Yeah, I tried the command below. The build was successful, but at the end I got this warning:

[WARNING] The requested profile "pyspark" could not be activated because it does not exist.

Pyspark exists on the machine. Do I need to do anything further?

Thanks,
Karthik

From: moon soo Lee [mailto:[email protected]]
Sent: Wednesday, July 8, 2015 10:58 AM
To: [email protected]
Subject: Re: Not able to see registered table records and Pyspark not working

Hi,

I meant adding the -Ppyspark profile, like:

mvn clean package -Pspark-1.3 -Ppyspark -Dhadoop.version=2.6.0-cdh5.4.0 -Phadoop-2.6 -DskipTests

Thanks,
moon

On Wed, Jul 8, 2015 at 10:43 AM Vadla, Karthik <[email protected]> wrote:

Hi Moon,

You mean I need to build something like this?

mvn clean package -Ppyspark-1.3 -Dhadoop.version=2.6.0-cdh5.4.0 -Phadoop-2.6 -DskipTests

I previously built my Zeppelin with the command below:

mvn clean package -Pspark-1.3 -Dhadoop.version=2.6.0-cdh5.4.0 -Phadoop-2.6 -DskipTests

Thanks,
Karthik

From: moon soo Lee [mailto:[email protected]]
Sent: Wednesday, July 8, 2015 10:20 AM
To: [email protected]
Subject: Re: Not able to see registered table records and Pyspark not working

Hi,

If you build the latest master branch with the -Ppyspark maven profile, pyspark will work without setting those environment variables.

Hope this helps.

Best,
moon

On Tue, Jul 7, 2015 at 3:47 PM Vadla, Karthik <[email protected]> wrote:

Hi All,

This part is commented out in zeppelin-env.sh in my conf folder:
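A sketch of one way to chase down the "profile pyspark does not exist" warning: the -Ppyspark profile only exists on recent master, so a release .zip checked out earlier will not declare it. The directory name and grep below are assumptions about a typical checkout, not verified paths. Note also that -DskipTests must use a plain ASCII hyphen; an en dash (–DskipTests) is silently treated as a program argument rather than an option.

```shell
cd incubator-zeppelin

# List every profile declared in the POMs; "pyspark" should appear here
# if the checked-out source is new enough to support it.
mvn help:all-profiles | grep -i pyspark

# If it appears, build with the profile enabled (all flags use plain
# ASCII hyphens):
mvn clean package -Pspark-1.3 -Ppyspark \
    -Dhadoop.version=2.6.0-cdh5.4.0 -Phadoop-2.6 -DskipTests
```

If grep prints nothing, pulling the latest master before building would be the next step.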
# Pyspark (supported with Spark 1.2.1 and above)
# To configure pyspark, you need to set the spark distribution's path in the
# 'spark.home' property on the Interpreter setting screen in the Zeppelin GUI
# export PYSPARK_PYTHON   # path to the python command. Must be the same path on the driver (Zeppelin) and all workers.
# export PYTHONPATH       # extra PYTHONPATH.

Can anyone help me set those up? Appreciate your help.

Thanks,
Karthik

From: Vadla, Karthik [mailto:[email protected]]
Sent: Tuesday, July 7, 2015 3:29 PM
To: [email protected]
Subject: RE: Not able to see registered table records and Pyspark not working

Hi Moon,

Thanks for that. The problem was with my parsing; I resolved it.

I have another question. I'm trying to run a simple print command using the pyspark interpreter, and it is not responding. When I look at the log, I don't see anything except this:

INFO [2015-07-07 15:19:17,702] ({pool-1-thread-41} SchedulerFactory.java[jobStarted]:132) - Job paragraph_1436305204170_601291630 started by scheduler remoteinterpreter_267235421
INFO [2015-07-07 15:19:17,702] ({pool-1-thread-41} Paragraph.java[jobRun]:194) - run paragraph 20150707-144004_475199059 using pyspark org.apache.zeppelin.interpreter.LazyOpenInterpreter@33a625a7
INFO [2015-07-07 15:19:17,702] ({pool-1-thread-41} Paragraph.java[jobRun]:211) - RUN : list=range(1,10)
print(list)
INFO [2015-07-07 15:19:18,060] ({Thread-255} NotebookServer.java[broadcast]:251) - SEND >> PROGRESS
INFO [2015-07-07 15:19:18,678] ({Thread-255} NotebookServer.java[broadcast]:251) - SEND >> PROGRESS
INFO [2015-07-07 15:19:19,278] ({Thread-255} NotebookServer.java[broadcast]:251) - SEND >> PROGRESS
INFO [2015-07-07 15:19:19,879] ({Thread-255} NotebookServer.java[broadcast]:251) - SEND >> PROGRESS

Do I need to do any config settings in zeppelin-env.sh or zeppelin-site.xml?
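For reference, a hypothetical sketch of what uncommenting those zeppelin-env.sh entries could look like. Every path below (the python binary, the Spark install location, the py4j zip version) is a placeholder assumption for this machine, not a known value; moon's advice above (building with -Ppyspark) makes these unnecessary on latest master.

```shell
# conf/zeppelin-env.sh -- illustrative values only.

# Python executable used by the pyspark interpreter; must resolve to the
# same path on the Zeppelin driver and on every worker node.
export PYSPARK_PYTHON=/usr/bin/python

# Extra PYTHONPATH entries so the interpreter can import pyspark itself;
# SPARK_HOME and the py4j zip name are assumed, check the local install.
export SPARK_HOME=/opt/spark
export PYTHONPATH="$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.8.2.1-src.zip:$PYTHONPATH"
```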
Thanks,
Karthik

From: moon soo Lee [mailto:[email protected]]
Sent: Friday, July 3, 2015 2:31 PM
To: [email protected]
Subject: Re: Not able to see registered table records

Hi,

Could you try this branch? https://github.com/apache/incubator-zeppelin/pull/136

It'll give you a better stacktrace than just displaying "java.lang.reflect.InvocationTargetException".

Thanks,
moon

On Thu, Jul 2, 2015 at 10:34 AM Vadla, Karthik <[email protected]> wrote:

Hi All,

I just registered a table using the code below:

val eduText = sc.textFile("hdfs://ip.address/user/karthik/education.csv")

case class Education(unitid: Integer, instnm: String, addr: String, city: String, stabbr: String, zip: Integer)

val education = eduText.map(s => s.split(","))
  .filter(s => s(0) != "UNITID")
  .map(s => Education(
    s(0).toInt,
    s(1).replaceAll("\"", ""),
    s(2).replaceAll("\"", ""),
    s(3).replaceAll("\"", ""),
    s(4).replaceAll("\"", ""),
    s(5).replaceAll("\"", "").toInt))

// The line below works only in Spark 1.3.0.
// For Spark 1.1.x and 1.2.x,
// use education.registerTempTable("education") instead.
education.toDF().registerTempTable("education")

When I run "%sql show tables", it displays the table "education". But when I run "%sql select count(*) from education", it throws the error below:

java.lang.reflect.InvocationTargetException

Can anyone help me with this? Appreciate your help. I've enclosed the .csv file used to register the table.

Thanks,
Karthik
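One way to try the pull-request branch moon links above is to fetch it directly from GitHub's pull refs; the local branch name "pr-136" below is arbitrary, and the build flags repeat the command used earlier in the thread.

```shell
git clone https://github.com/apache/incubator-zeppelin.git
cd incubator-zeppelin

# GitHub exposes each pull request's head under refs/pull/<n>/head.
git fetch origin pull/136/head:pr-136
git checkout pr-136

# Rebuild with the same options as before.
mvn clean package -Pspark-1.3 -Dhadoop.version=2.6.0-cdh5.4.0 -Phadoop-2.6 -DskipTests
```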
