RE: [jira] [Created] (ZEPPELIN-185) z.show does not work on DataFrame in pyspark

Felix Cheung Mon, 10 Aug 2015 08:48:57 -0700

Could you elaborate? Are you referring to working around this issue? The fix for
this has been merged.
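
If a workaround is what you need on a build that predates the fix, here is a rough sketch that avoids z.show() entirely. It assumes Zeppelin's %table display convention (paragraph output that starts with "%table", tab-separated cells, one row per line). to_zeppelin_table is a hypothetical helper name used for illustration, not a Zeppelin API.

```python
def to_zeppelin_table(columns, rows):
    """Build a string that Zeppelin's display system renders as a table.

    Hypothetical helper, not part of Zeppelin. Assumes the %table convention:
    a leading "%table " marker, tab-separated cells, one row per line.
    """
    header = "\t".join(str(c) for c in columns)
    body = "\n".join("\t".join(str(cell) for cell in row) for row in rows)
    return "%table " + header + "\n" + body


# In a %pyspark paragraph, with df built as in the report below (left as a
# comment so the helper itself runs without Spark):
#   print(to_zeppelin_table(df.columns, df.collect()))

# Stand-alone demonstration with plain tuples in place of Row objects.
print(to_zeppelin_table(["first"], [("1",), ("2",), ("3",)]))
```

Note that df.collect() pulls every row to the driver, so this is only reasonable for small results, and cells that themselves contain tabs or newlines would break the layout.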


> From: [email protected]
> Date: Mon, 10 Aug 2015 11:48:13 +0000
> Subject: Re: [jira] [Created] (ZEPPELIN-185) z.show does not work on 
> DataFrame in pyspark
> To: [email protected]
> 
> Does anyone know how to solve this one? My users are using Python, and
> iterating through the DF each time is not useful.
> Eran
> 
> On Sat, Jul 25, 2015 at 10:06 PM Felix Cheung (JIRA) <[email protected]>
> wrote:
> 
> > Felix Cheung created ZEPPELIN-185:
> > -------------------------------------
> >
> >              Summary: z.show does not work on DataFrame in pyspark
> >                  Key: ZEPPELIN-185
> >                  URL: https://issues.apache.org/jira/browse/ZEPPELIN-185
> >              Project: Zeppelin
> >           Issue Type: Bug
> >           Components: Core, Interpreters
> >     Affects Versions: 0.6.0
> >             Reporter: Felix Cheung
> >             Assignee: Felix Cheung
> >
> >
> > I’ve tested this out and found these issues. Firstly,
> >
> >
> > http://spark.apache.org/docs/latest/api/python/pyspark.sql.html?highlight=createdataframe#pyspark.sql.SQLContext.createDataFrame
> > # Code should be changed to this – it does not work in pyspark CLI otherwise
> > from pyspark.sql import Row  # Row is not imported by default in the CLI
> > rdd = sc.parallelize(["1","2","3"])
> > Data = Row('first')
> > df = sqlContext.createDataFrame(rdd.map(lambda d: Data(d)))
> >
> > Secondly,
> > z.show() doesn’t seem to work properly in Python – I see the same error
> > below: “AttributeError: 'DataFrame' object has no attribute
> > '_get_object_id'”
> > #Python/PySpark – doesn’t work
> > rdd = sc.parallelize(["1","2","3"])
> > Data = Row('first')
> > df = sqlContext.createDataFrame(rdd.map(lambda d: Data(d)))
> > print df
> > print df.collect()
> > z.show(df)
> >         AttributeError: 'DataFrame' object has no attribute '_get_object_id'
> >
> > #Scala – this works
> > val a = sc.parallelize(List("1", "2", "3"))
> > val df = a.toDF()
> > z.show(df)
> >
> >
> >
> > --
> > This message was sent by Atlassian JIRA
> > (v6.3.4#6332)
> >
