Re: Not able to see registered table records and Pyspark not working

moon soo Lee Wed, 08 Jul 2015 10:20:28 -0700

Hi,

If you build the latest master branch with the -Ppyspark Maven profile,
pyspark should work without setting those environment variables.
Hope this helps.
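
A minimal sketch of such a build, assuming a Zeppelin source checkout and
Maven on the PATH. Only -Ppyspark comes from the advice above; -DskipTests
is a standard Maven option added here just to shorten the build, and
depending on your Spark and Hadoop versions, further profiles may be needed.

```shell
# Run from the root of a Zeppelin source checkout (location is an assumption).
# -Ppyspark is the profile named above; -DskipTests only skips the test phase.
BUILD_ARGS="clean package -Ppyspark -DskipTests"

if [ -f pom.xml ] && command -v mvn >/dev/null 2>&1; then
  mvn $BUILD_ARGS
else
  # Guard so the snippet is harmless outside a source tree.
  echo "Not in a Maven project root (or mvn missing); would run: mvn $BUILD_ARGS"
fi
```

Restart Zeppelin after the build so the pyspark interpreter picks up the
new binaries.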


Best,
moon

On Tue, Jul 7, 2015 at 3:47 PM Vadla, Karthik <[email protected]>
wrote:

>  Hi All,
>
>
>
> This part is commented out in *zeppelin-env.sh* in my conf folder.
>
>
>
> # Pyspark (supported with Spark 1.2.1 and above)
> # To configure pyspark, you need to set spark distribution's path to 'spark.home' property in Interpreter setting screen in Zeppelin GUI
> # export PYSPARK_PYTHON          # path to the python command. must be the same path on the driver(Zeppelin) and all workers.
> # export PYTHONPATH              # extra PYTHONPATH.
>
>
>
> Can anyone help me with how to set those up?
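
For a build without the -Ppyspark profile, the two commented lines could
be filled in roughly as below. This is a sketch, not a confirmed
configuration: /opt/spark and /usr/bin/python are hypothetical paths, and
the layout under $SPARK_HOME/python (PySpark sources plus a bundled py4j
zip) is an assumption about your Spark distribution. The 'spark.home'
property mentioned in the comments still has to be set in the Interpreter
screen.

```shell
# Sketch of the two pyspark lines in conf/zeppelin-env.sh, uncommented.
# /opt/spark is a hypothetical install path, not one taken from this thread.
export SPARK_HOME="${SPARK_HOME:-/opt/spark}"

# Python command; it must resolve to the same path on the driver and all workers.
export PYSPARK_PYTHON="${PYSPARK_PYTHON:-/usr/bin/python}"

# Assumption: PySpark's sources and the bundled py4j zip live under
# $SPARK_HOME/python. The zip name varies by Spark release, so glob for it.
PY4J_ZIP="$(ls "$SPARK_HOME"/python/lib/py4j-*-src.zip 2>/dev/null | head -n 1 || true)"

# Prepend the PySpark paths and keep any PYTHONPATH that was already set.
export PYTHONPATH="$SPARK_HOME/python${PY4J_ZIP:+:$PY4J_ZIP}${PYTHONPATH:+:$PYTHONPATH}"
```

Restart Zeppelin after editing the file so the interpreter process sees
the new values.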
>
>
>
> Appreciate your help.
>
>
>
> Thanks
>
> Karthik
>
>
>
> *From:* Vadla, Karthik [mailto:[email protected]]
> *Sent:* Tuesday, July 7, 2015 3:29 PM
> *To:* [email protected]
> *Subject:* RE: Not able to see registered table records and Pyspark not
> working
>
>
>
> Hi Moon,
>
>
>
> Thanks for that.
> The problem was with my parsing. I resolved it.
>
>
>
> I have another question to ask.
>
> I’m just trying to run a *print* command using the pyspark interpreter.
> It is not responding.
>
>
>
> When I look at the log, I don’t see any information except this:
>
>
>
> INFO [2015-07-07 15:19:17,702] ({pool-1-thread-41}
> SchedulerFactory.java[jobStarted]:132) - Job
> paragraph_1436305204170_601291630 started by scheduler
> remoteinterpreter_267235421
>
> INFO [2015-07-07 15:19:17,702] ({pool-1-thread-41}
> Paragraph.java[jobRun]:194) - run paragraph 20150707-144004_475199059 using
> pyspark org.apache.zeppelin.interpreter.LazyOpenInterpreter@33a625a7
>
> INFO [2015-07-07 15:19:17,702] ({pool-1-thread-41}
> Paragraph.java[jobRun]:211) - RUN : list=range(1,10)
>
> print(list)
>
> INFO [2015-07-07 15:19:18,060] ({Thread-255}
> NotebookServer.java[broadcast]:251) - SEND >> PROGRESS
>
> INFO [2015-07-07 15:19:18,678] ({Thread-255}
> NotebookServer.java[broadcast]:251) - SEND >> PROGRESS
>
> INFO [2015-07-07 15:19:19,278] ({Thread-255}
> NotebookServer.java[broadcast]:251) - SEND >> PROGRESS
>
> INFO [2015-07-07 15:19:19,879] ({Thread-255}
> NotebookServer.java[broadcast]:251) - SEND >> PROGRESS
>
>
>
>
>
> Do I need to change any configuration settings in *zeppelin-env.sh* or
> *zeppelin-site.xml*?
>
>
>
>
>
> Thanks
>
> Karthik
>
>
>
>
>
>
>
> *From:* moon soo Lee [mailto:[email protected] <[email protected]>]
> *Sent:* Friday, July 3, 2015 2:31 PM
> *To:* [email protected]
> *Subject:* Re: Not able to see registered table records
>
>
>
> Hi,
>
>
>
> Could you try this branch?
> https://github.com/apache/incubator-zeppelin/pull/136
>
>
>
> It'll give you a better stack trace than just displaying
> "java.lang.reflect.InvocationTargetException".
>
>
>
> Thanks,
>
> moon
>
>
>
> On Thu, Jul 2, 2015 at 10:34 AM Vadla, Karthik <[email protected]>
> wrote:
>
>  Hi All.
>
>
>
> I just registered a table using the code below:
>
>
>
> val eduText = sc.textFile("hdfs://ip.address/user/karthik/education.csv")
>
> case class Education(unitid: Integer, instnm: String, addr: String,
>                      city: String, stabbr: String, zip: Integer)
>
> // Split each line on commas, skip the header row, and strip quotes.
> val education = eduText.map(s => s.split(",")).filter(s => s(0) != "UNITID").map(
>     s => Education(s(0).toInt,
>             s(1).replaceAll("\"", ""),
>             s(2).replaceAll("\"", ""),
>             s(3).replaceAll("\"", ""),
>             s(4).replaceAll("\"", ""),
>             s(5).replaceAll("\"", "").toInt
>         )
> )
>
> // Below line works only in spark 1.3.0.
> // For spark 1.1.x and spark 1.2.x,
> // use education.registerTempTable("education") instead.
>
> education.toDF().registerTempTable("education")
>
>
>
> When I run *“%sql show tables”*, it displays the table “education”.
>
> But when I try to run the command *“%sql select count(*) from education”*,
> it throws the error below.
>
>
>
> java.lang.reflect.InvocationTargetException
>
>
>
>
>
>
>
> Can anyone help me with this?
>
> Appreciate your help.
>
>
>
> I have also attached the .csv file used to register the table.
>
>
>
> Thanks
>
> Karthik
>
>
