You can do this:

$ sbt/sbt hive/console

scala> jsonRDD(sparkContext.parallelize(
         """{ "name":"John", "age":53, "locations": [{ "street":"Rodeo Dr", "number":2300 }]}""" :: Nil
       )).registerTempTable("people")

scala> sql("""SELECT name FROM people LATERAL VIEW explode(locations) l AS location
              WHERE location.number = 2300""").collect()
res0: Array[org.apache.spark.sql.Row] = Array([John])

Note that this returns one row per matching address, so a person with more
than one matching address will show up more than once.
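To see why the duplicates appear, here is a rough sketch of the same query written against plain Scala collections instead of Spark. The case classes and the second record ("Jane") are made up for illustration and are not part of the example above; the for-comprehension only imitates what LATERAL VIEW explode does, it is not how Spark implements it.

```scala
// Plain Scala collections standing in for the "people" table; no Spark needed.
// Location, Person and the "Jane" record are hypothetical, for illustration only.
case class Location(street: String, number: Int)
case class Person(name: String, age: Int, locations: Seq[Location])

val people = Seq(
  Person("John", 53, Seq(Location("Rodeo Dr", 2300))),
  Person("Jane", 41, Seq(Location("Main St", 2300), Location("Oak Ave", 2300)))
)

// Roughly what LATERAL VIEW explode(locations) does:
// one output row per (person, location) pair.
val exploded = for {
  p <- people
  location <- p.locations
} yield (p.name, location)

// WHERE location.number = 2300, then SELECT name.
val names = exploded.collect { case (name, loc) if loc.number == 2300 => name }

// A person with two matching addresses appears twice in `names`;
// de-duplicating after the filter keeps each person once.
val uniqueNames = names.distinct
```

In the SQL version, writing SELECT DISTINCT name should have the same effect as the `distinct` call here, if you only want each person listed once.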

On Tue, Oct 28, 2014 at 5:52 PM, Corey Nolet <cjno...@gmail.com> wrote:

> So it wouldn't be possible to have a json string like this:
>
> { "name":"John", "age":53, "locations": [{ "street":"Rodeo Dr",
> "number":2300 }]}
>
> And query all people who have a location with number = 2300?
>
>
>
>
> On Tue, Oct 28, 2014 at 5:30 PM, Michael Armbrust <mich...@databricks.com>
> wrote:
>
>> On Tue, Oct 28, 2014 at 2:19 PM, Corey Nolet <cjno...@gmail.com> wrote:
>>
>>> Is it possible to select if, say, there was an addresses field that had
>>> a json array?
>>>
>> You can get the Nth item by "address".getItem(0).  If you want to walk
>> through the whole array look at LATERAL VIEW EXPLODE in HiveQL
>>
>>
>
>
