On Thu, May 24, 2012 at 3:10 PM, Gesli, Nicole
<nicole.ge...@memorylane.com>wrote:

>  Try this:
>
>  SELECT a.a_id, b.path
> FROM ( SELECT a_id, MIN(t_timestamp) t_timestamp
>        FROM   web_data
>        GROUP BY a_id
>      ) a JOIN
>        web_data b ON ( b.a_id = a.a_id AND b.t_timestamp = a.t_timestamp )
>

Awesome that works. I didn't realize I could do it this way. Thanks for
your help and others as well.

>
> -Nicole
>
> From: Roberto Sanabria <robe...@stumbleupon.com>
> Reply-To: <user@hive.apache.org>
> Date: Thu, 24 May 2012 15:06:29 -0700
> To: <user@hive.apache.org>
> Subject: Re: SQL help
>
> "I guess do some kind of group by and store it in intermediate file and
> run another select on it?"
>
> Yes, that is my recommendation.
>
> On Thu, May 24, 2012 at 2:57 PM, Mohit Anchlia <mohitanch...@gmail.com>wrote:
>
>>
>>
>>  On Thu, May 24, 2012 at 2:19 PM, Edward Capriolo 
>> <edlinuxg...@gmail.com>wrote:
>>
>>> Hive is not SQL 92 compliant or whatever.
>>>
>>> https://cwiki.apache.org/Hive/languagemanual.html
>>>
>>> in particular you can not do subselects inside the in or the where
>>> clause. Hive usually have other formulations like left semi join that
>>> makes things 'like in' and 'not in' possible.
>>>
>>> Thanks. But what I am looking for is to select only those rows that are
>> of min(t_timestamp) for a given a_id. What would be the best way? I guess
>> do some kind of group by and store it in intermediate file and run another
>> select on it?
>>
>>
>>> Edward
>>>  On Thu, May 24, 2012 at 5:13 PM, Mohit Anchlia <mohitanch...@gmail.com>
>>> wrote:
>>> > I am now trying to do it this way but doesn't work in hive. I think I
>>> am
>>> > missing something here, can someone please help?
>>> >
>>> > select a_id from web_data t1 where a_id = (select min(a_id) from
>>> web_data t2
>>> > where t2.t_timestamp = t1.t_timestamp)
>>> >
>>> > I get:
>>> >
>>> >
>>> > FAILED: Parse Error: line 1:69 cannot recognize input near 'select'
>>> 'min'
>>> > '(' in expression specification
>>> >
>>> >
>>> >
>>> > On Thu, May 24, 2012 at 1:02 PM, Mohit Anchlia <mohitanch...@gmail.com
>>> >
>>> > wrote:
>>> >>
>>> >> I am new to Hive. I have several SQL from RDBMS database that I need
>>> to
>>> >> convert to hive. What's the best reference for HIVEQL? For now I am
>>> trying
>>> >> to figure out how to do this in hive:
>>> >>
>>> >> Select  distinct A_ID, First_Value(path IGNORE NULLS) over(PARTITION
>>> BY
>>> >> A_ID ORDER BY t_timestamp) From WEB_DATA
>>> >>
>>> >> Any help would be appreciated.
>>> >
>>> >
>>> >
>>>
>>
>>
>

Reply via email to