Re: How to extract complex JSON structures using Apache Spark 1.4.0 Data Frames

2015-07-18 Thread Naveen Madhire
I am facing the same issue. I tried this, but I get a compilation error for the "$" in the explode function, so I had to change it to the following to make it work:

    df.select(explode(new Column("entities.user_mentions")).as("mention"))

On Wed, Jun 24, 2015 at 2:48 PM, Michael Armbrust wrote:
> Star
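A likely cause of the "$" compilation error is that the $"..." column syntax is only in scope after `import sqlContext.implicits._`. The sketch below shows that import next to two alternatives that do not need it: `new Column(...)` and `functions.col(...)`. The object name, the helper `mentionsVariants`, and the sample record are hypothetical; only the `entities.user_mentions` path comes from the thread.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{Column, DataFrame, SQLContext}
import org.apache.spark.sql.functions.{col, explode}

object DollarSyntaxSketch {
  // Hypothetical sample record standing in for a Twitter status.
  val sample: String =
    """{"id": 1, "entities": {"user_mentions": [{"screen_name": "alice"}, {"screen_name": "bob"}]}}"""

  // Three equivalent ways to reference the nested array column.
  def mentionsVariants(sqlContext: SQLContext, df: DataFrame): Seq[DataFrame] = {
    // Brings the $"..." string interpolator into scope.
    import sqlContext.implicits._
    Seq(
      df.select(explode($"entities.user_mentions").as("mention")),
      // No implicits needed for the next two forms.
      df.select(explode(new Column("entities.user_mentions")).as("mention")),
      df.select(explode(col("entities.user_mentions")).as("mention"))
    )
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("dollar-syntax-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    val df = sqlContext.read.json(sc.parallelize(Seq(sample)))
    mentionsVariants(sqlContext, df).foreach(_.show())
    sc.stop()
  }
}
```

Each variant should produce one row per element of the `user_mentions` array.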

Re: How to extract complex JSON structures using Apache Spark 1.4.0 Data Frames

2015-06-24 Thread Michael Armbrust
Starting in Spark 1.4 there is also an explode that you can use directly from the select clause (much like in HiveQL):

    import org.apache.spark.sql.functions._
    df.select(explode($"entities.user_mentions").as("mention"))

Unlike standard HiveQL, you can also include other attributes in the select or
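As a minimal sketch of selecting another attribute next to the exploded column, the example below keeps a top-level `id` beside each mention and then reads one field out of the exploded struct. The sample record, the `id` and `screen_name` fields, and the names `SelectWithExplodeSketch` and `idWithScreenName` are assumptions made for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext}
import org.apache.spark.sql.functions.explode

object SelectWithExplodeSketch {
  // Hypothetical sample record; the real Twitter status has many more fields.
  val sample: String =
    """{"id": 1, "entities": {"user_mentions": [{"screen_name": "alice"}, {"screen_name": "bob"}]}}"""

  def idWithScreenName(sqlContext: SQLContext, df: DataFrame): DataFrame = {
    import sqlContext.implicits._
    df
      // Keep "id" alongside one output row per array element.
      .select($"id", explode($"entities.user_mentions").as("mention"))
      // Pull a single field out of the exploded struct.
      .select($"id", $"mention.screen_name")
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("select-with-explode-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    val df = sqlContext.read.json(sc.parallelize(Seq(sample)))
    idWithScreenName(sqlContext, df).show()
    sc.stop()
  }
}
```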

Re: How to extract complex JSON structures using Apache Spark 1.4.0 Data Frames

2015-06-24 Thread Yin Huai
The function accepted by explode is f: Row => TraversableOnce[A]. It seems user_mentions is an array of structs, so can you change your pattern matching to the following?

    case Row(rows: Seq[_]) => rows.asInstanceOf[Seq[Row]].map(elem => ...)

On Wed, Jun 24, 2015 at 5:27 AM, Gustavo Arjones wrote:
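The suggested pattern match can be fleshed out roughly as follows. This is a sketch under assumptions: the `Mention` case class, the `toMentions` helper, and the sample record are hypothetical, and the struct is read by position because the sample struct has a single field. The body elided as "..." in the message above may have been different.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, Row, SQLContext}

object RowExplodeSketch {
  // Hypothetical output type; explode needs a Product (e.g. a case class).
  case class Mention(screenName: String)

  // Hypothetical sample record with a one-field struct per mention.
  val sample: String =
    """{"id": 1, "entities": {"user_mentions": [{"screen_name": "alice"}, {"screen_name": "bob"}]}}"""

  // Row => TraversableOnce[Mention]: the input Row wraps the single array column.
  val toMentions: Row => Seq[Mention] = {
    case Row(rows: Seq[_]) =>
      // Each array element is itself a Row (a struct); index 0 is screen_name
      // only because the sample struct has exactly one field.
      rows.asInstanceOf[Seq[Row]].map(elem => Mention(elem.getString(0)))
    case _ =>
      // A null or unexpected value yields no output rows.
      Seq.empty
  }

  def explodeMentions(df: DataFrame): DataFrame =
    df.explode[Mention](df("entities.user_mentions"))(toMentions)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("row-explode-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    val df = sqlContext.read.json(sc.parallelize(Seq(sample)))
    explodeMentions(df).show()
    sc.stop()
  }
}
```

The result keeps the original columns and appends one column per field of the case class, here `screenName`.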

How to extract complex JSON structures using Apache Spark 1.4.0 Data Frames

2015-06-24 Thread Gustavo Arjones
Hi All, I am using the new Apache Spark version 1.4.0 DataFrame API to extract information from Twitter's Status JSON, mostly focused on the Entities object. The part relevant to this question is shown below: { ... ... "entities": {