Michael,

Awesome, this is exactly what I was looking for. So it's possible to use the Hive dialect in a regular SQL context? That's what was confusing me: the docs kind of allude to it but never point it out directly.
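In case it helps anyone else who lands on this thread: the same thing seems to work outside of hive/console in a standalone program by constructing a HiveContext instead of a plain SQLContext (whose simpler parser, as far as I can tell, doesn't handle LATERAL VIEW). A minimal, untested sketch, assuming Spark 1.1-era APIs and an existing SparkContext named sc:

    import org.apache.spark.sql.hive.HiveContext

    // assumption: sc is an already-running SparkContext
    val hiveCtx = new HiveContext(sc)
    import hiveCtx._  // brings sql(), jsonRDD(), etc. into scope, like hive/console does

    // same JSON record as below, registered as a temp table
    jsonRDD(sc.parallelize("""{ "name":"John", "age":53, "locations": [{ "street":"Rodeo Dr", "number":2300 }]}""" :: Nil)).registerTempTable("people")

    // HiveContext.sql() defaults to the HiveQL dialect, so LATERAL VIEW explode is available
    sql("SELECT name FROM people LATERAL VIEW explode(locations) l AS location WHERE location.number = 2300").collect()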
On Tue, Oct 28, 2014 at 9:30 PM, Michael Armbrust <mich...@databricks.com> wrote:
> You can do this:
>
> $ sbt/sbt hive/console
>
> scala> jsonRDD(sparkContext.parallelize("""{ "name":"John", "age":53, "locations": [{ "street":"Rodeo Dr", "number":2300 }]}""" :: Nil)).registerTempTable("people")
>
> scala> sql("SELECT name FROM people LATERAL VIEW explode(locations) l AS location WHERE location.number = 2300").collect()
> res0: Array[org.apache.spark.sql.Row] = Array([John])
>
> This will double-show people who have more than one matching address.
>
> On Tue, Oct 28, 2014 at 5:52 PM, Corey Nolet <cjno...@gmail.com> wrote:
>
>> So it wouldn't be possible to have a JSON string like this:
>>
>> { "name":"John", "age":53, "locations": [{ "street":"Rodeo Dr", "number":2300 }]}
>>
>> And query all people who have a location with number = 2300?
>>
>> On Tue, Oct 28, 2014 at 5:30 PM, Michael Armbrust <mich...@databricks.com> wrote:
>>
>>> On Tue, Oct 28, 2014 at 2:19 PM, Corey Nolet <cjno...@gmail.com> wrote:
>>>
>>>> Is it possible to select if, say, there was an addresses field that had a json array?
>>>>
>>> You can get the Nth item with "address".getItem(0). If you want to walk through the whole array, look at LATERAL VIEW EXPLODE in HiveQL.
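A quick follow-up on the duplicate-row caveat above, for the archives: since LATERAL VIEW produces one output row per array element, adding DISTINCT is one way to collapse the repeats, and HiveQL's bracket syntax covers the getItem-style single-element access from the earlier reply. An untested sketch against the same "people" table, assuming the same hive/console session:

scala> // DISTINCT collapses the duplicates from people with several matching locations
scala> sql("SELECT DISTINCT name FROM people LATERAL VIEW explode(locations) l AS location WHERE location.number = 2300").collect()

scala> // fetching the Nth array element directly (0-based), without exploding the whole array
scala> sql("SELECT locations[0].number FROM people").collect()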