Michael,

Awesome, this is exactly what I was looking for. So it's possible to use the Hive dialect in a regular SQL context? That's what was confusing me: the docs kind of allude to it but never point it out directly.
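In case it helps anyone else who lands on this thread: the same thing seems to work outside of hive/console in a standalone program by constructing a HiveContext instead of a plain SQLContext (whose simpler parser, as far as I can tell, doesn't handle LATERAL VIEW). A minimal, untested sketch, assuming Spark 1.1-era APIs and an existing SparkContext named sc:

    import org.apache.spark.sql.hive.HiveContext

    // assumption: sc is an already-running SparkContext
    val hiveCtx = new HiveContext(sc)
    import hiveCtx._  // brings sql(), jsonRDD(), etc. into scope, like hive/console does

    // same JSON record as below, registered as a temp table
    jsonRDD(sc.parallelize("""{ "name":"John", "age":53, "locations": [{ "street":"Rodeo Dr", "number":2300 }]}""" :: Nil)).registerTempTable("people")

    // HiveContext.sql() defaults to the HiveQL dialect, so LATERAL VIEW explode is available
    sql("SELECT name FROM people LATERAL VIEW explode(locations) l AS location WHERE location.number = 2300").collect()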
On Tue, Oct 28, 2014 at 9:30 PM, Michael Armbrust <mich...@databricks.com> wrote:
> You can do this:
>
> $ sbt/sbt hive/console
>
> scala> jsonRDD(sparkContext.parallelize("""{ "name":"John", "age":53, "locations": [{ "street":"Rodeo Dr", "number":2300 }]}""" :: Nil)).registerTempTable("people")
>
> scala> sql("SELECT name FROM people LATERAL VIEW explode(locations) l AS location WHERE location.number = 2300").collect()
> res0: Array[org.apache.spark.sql.Row] = Array([John])
>
> This will double-show people who have more than one matching address.
>
> On Tue, Oct 28, 2014 at 5:52 PM, Corey Nolet <cjno...@gmail.com> wrote:
>
>> So it wouldn't be possible to have a JSON string like this:
>>
>> { "name":"John", "age":53, "locations": [{ "street":"Rodeo Dr", "number":2300 }]}
>>
>> And query all people who have a location with number = 2300?
>>
>> On Tue, Oct 28, 2014 at 5:30 PM, Michael Armbrust <mich...@databricks.com> wrote:
>>
>>> On Tue, Oct 28, 2014 at 2:19 PM, Corey Nolet <cjno...@gmail.com> wrote:
>>>
>>>> Is it possible to select if, say, there was an addresses field that had a json array?
>>>>
>>> You can get the Nth item with "address".getItem(0). If you want to walk through the whole array, look at LATERAL VIEW EXPLODE in HiveQL.
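A quick follow-up on the duplicate-row caveat above, for the archives: since LATERAL VIEW produces one output row per array element, adding DISTINCT is one way to collapse the repeats, and HiveQL's bracket syntax covers the getItem-style single-element access from the earlier reply. An untested sketch against the same "people" table, assuming the same hive/console session:

scala> // DISTINCT collapses the duplicates from people with several matching locations
scala> sql("SELECT DISTINCT name FROM people LATERAL VIEW explode(locations) l AS location WHERE location.number = 2300").collect()

scala> // fetching the Nth array element directly (0-based), without exploding the whole array
scala> sql("SELECT locations[0].number FROM people").collect()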