Am I able to do a join on an exploded field? Like if I have another object:
{ "streetNumber":"2300", "locationName":"The Big Building"} and I want to join with the previous json by the locations[].number field- is that possible? On Tue, Oct 28, 2014 at 9:31 PM, Corey Nolet <cjno...@gmail.com> wrote: > Michael, > > Awesome, this is what I was looking for. So it's possible to use hive > dialect in a regular sql context? This is what was confusing to me- the > docs kind of allude to it but don't directly point it out. > > On Tue, Oct 28, 2014 at 9:30 PM, Michael Armbrust <mich...@databricks.com> > wrote: > >> You can do this: >> >> $ sbt/sbt hive/console >> >> scala> jsonRDD(sparkContext.parallelize("""{ "name":"John", "age":53, >> "locations": [{ "street":"Rodeo Dr", "number":2300 }]}""" :: >> Nil)).registerTempTable("people") >> >> scala> sql("SELECT name FROM people LATERAL VIEW explode(locations) l AS >> location WHERE location.number = 2300").collect() >> res0: Array[org.apache.spark.sql.Row] = Array([John]) >> >> This will double show people who have more than one matching address. >> >> On Tue, Oct 28, 2014 at 5:52 PM, Corey Nolet <cjno...@gmail.com> wrote: >> >>> So it wouldn't be possible to have a json string like this: >>> >>> { "name":"John", "age":53, "locations": [{ "street":"Rodeo Dr", >>> "number":2300 }]} >>> >>> And query all people who have a location with number = 2300? >>> >>> >>> >>> >>> On Tue, Oct 28, 2014 at 5:30 PM, Michael Armbrust < >>> mich...@databricks.com> wrote: >>> >>>> On Tue, Oct 28, 2014 at 2:19 PM, Corey Nolet <cjno...@gmail.com> wrote: >>>> >>>>> Is it possible to select if, say, there was an addresses field that >>>>> had a json array? >>>>> >>>> You can get the Nth item by "address".getItem(0). If you want to walk >>>> through the whole array look at LATERAL VIEW EXPLODE in HiveQL >>>> >>>> >>> >>> >> >