On Thu, May 24, 2012 at 2:19 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:

> Hive is not SQL 92 compliant or whatever.
>
> https://cwiki.apache.org/Hive/languagemanual.html
>
> in particular you can not do subselects inside the in or the where
> clause. Hive usually have other formulations like left semi join that
> makes things 'like in' and 'not in' possible.
>

Thanks. But what I am looking for is to select only those rows that have
the min(t_timestamp) for a given a_id. What would be the best way? I guess
do some kind of group by, store it in an intermediate file, and run another
select on it?

> Edward
>
> On Thu, May 24, 2012 at 5:13 PM, Mohit Anchlia <mohitanch...@gmail.com>
> wrote:
> > I am now trying to do it this way but doesn't work in hive. I think I am
> > missing something here, can someone please help?
> >
> > select a_id from web_data t1 where a_id = (select min(a_id) from
> > web_data t2 where t2.t_timestamp = t1.t_timestamp)
> >
> > I get:
> >
> > FAILED: Parse Error: line 1:69 cannot recognize input near 'select' 'min'
> > '(' in expression specification
> >
> > On Thu, May 24, 2012 at 1:02 PM, Mohit Anchlia <mohitanch...@gmail.com>
> > wrote:
> >>
> >> I am new to Hive. I have several SQL from RDBMS database that I need to
> >> convert to hive. What's the best reference for HIVEQL? For now I am
> >> trying to figure out how to do this in hive:
> >>
> >> Select distinct A_ID, First_Value(path IGNORE NULLS) over(PARTITION BY
> >> A_ID ORDER BY t_timestamp) From WEB_DATA
> >>
> >> Any help would be appreciated.
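
For the min(t_timestamp)-per-a_id question, an intermediate file may not be needed. Hive rejects a subselect in the WHERE clause, but it does accept one in the FROM clause, so the group by can probably be joined back to the table in a single statement. A rough, untested sketch follows. The table and column names (web_data, a_id, t_timestamp, path) are taken from the thread; the aliases w, m and min_ts are made up here for illustration.

```sql
-- Sketch only, not verified against any particular Hive release.
-- w, m and min_ts are illustrative names, not from the original thread.
SELECT w.a_id, w.path, w.t_timestamp
FROM web_data w
JOIN (
  -- earliest timestamp seen for each a_id
  SELECT a_id, MIN(t_timestamp) AS min_ts
  FROM web_data
  GROUP BY a_id
) m
ON (w.a_id = m.a_id AND w.t_timestamp = m.min_ts);
```

Two caveats. If several rows share the same minimum t_timestamp for an a_id, all of them come back, so this is not an exact equivalent of the First_Value query. It also does nothing about IGNORE NULLS; filtering with path IS NOT NULL in both the inner and the outer query would approximate that, at the cost of dropping any a_id whose paths are all NULL.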