Still it's not working with the use of my rank UDF. Below is the query I am using
Can anyone help me, what changes I need to make in my below sql query? CREATE TABLE IF NOT EXISTS TestingTable1 ( BUYER_ID BIGINT, ITEM_ID BIGINT, CREATED_TIME STRING ) *Find TOP 10 latest data (ITEM_ID, CREATED_TIME) for each BUYER_ID for yesterday's date by sorting the created_time in descending order.* So what I was thinking is that, with the use of this below query, I will be getting- SELECT * FROM TestingTable1 WHERE ORDER BY buyer_id, created_time DESC; All the BUYER_ID and its ITEM_ID corresponding to CREATED_TIME in Descending order, But how I can pick TOP 10 for each BUYER_ID.? This is my SQL Query that I am using with the use of rank UDF. SELECT buyer_id, item_id, created_time, rk FROM ( SELECT buyer_id, item_id, rank(item_id) as rk, created_time FROM testingtable1 DISTRIBUTE BY buyer_id, item_id SORT BY buyer_id, item_id, created_time desc ) a WHERE rk < 10 ORDER BY buyer_id, created_time, rk; *Raihan Jamal* On Tue, Jul 10, 2012 at 12:16 AM, Jasper Knulst <jasper.knu...@incentro.com>wrote: > Hi Raihan, > > You should use 'rank(buyer_id)' in the order by clause on line 9 in stead > of the alias 'rk'. I had the same problem, strangely, the alias is not > resolved when it is in the order by clause. > > Other thing, I had some issues when I used this exact same set up for > ranking results, that the rank UDF was already implemented at the map > phase. Then you get very strange results. You have to introduce an extra > subquery for it to work. > > Jasper > Op 10 jul. 2012 09:01 schreef "Raihan Jamal" <jamalrai...@gmail.com> het > volgende: > > I was not able to understand, This is my below qiuery that I am using >> currently- >> >> SELECT buyer_id, item_id, created_time >> FROM ( >> SELECT buyer_id, item_id, rank(buyer_id) as rank, created_time >> FROM testingtable1 >> DISTRIBUTE BY buyer_id, item_id >> SORT BY buyer_id, item_id, created_time desc >> ) a >> WHERE rank < 10 >> ORDER BY buyer_id, created_time, rank; >> >> >> What changes I need to make? >> >> >> >> >> *Raihan Jamal* >> >> >> >> On Mon, Jul 9, 2012 at 11:52 PM, Nitin Pawar <nitinpawar...@gmail.com>wrote: >> >>> try rk in upper select statement as well >>> >>> >>> On Tue, Jul 10, 2012 at 12:12 PM, Raihan Jamal <jamalrai...@gmail.com>wrote: >>> >>>> Thanks for commenting out. Yes I figured that out, its a UDF. So now I >>>> have created a new UDF Rank and added to classpath also. But when I am >>>> again running the below query- >>>> >>>> SELECT buyer_id, item_id, created_time >>>> FROM ( >>>> SELECT buyer_id, item_id, Rank(buyer_id) as rk, created_time >>>> FROM testingtable1 >>>> DISTRIBUTE BY buyer_id, item_id >>>> SORT BY buyer_id, item_id, created_time desc >>>> ) a >>>> WHERE rk < 10 >>>> ORDER BY buyer_id, rk; >>>> >>>> I am getting this below error- >>>> >>>> *FAILED: Error in semantic analysis: line 9:19 Invalid Table Alias or >>>> Column Reference rk* >>>> >>>> >>>> Why is it so? Any suggestions? >>>> >>>> >>>> *Raihan Jamal* >>>> >>>> >>>> >>>> On Mon, Jul 9, 2012 at 10:51 PM, Vijay <tec...@gmail.com> wrote: >>>> >>>>> hive has no built-in rank function. you'd need to use a user-defined >>>>> function (UDF) to simulate it. there are a few custom implementations >>>>> on the net that you can leverage. >>>>> >>>>> On Mon, Jul 9, 2012 at 10:40 PM, Raihan Jamal <jamalrai...@gmail.com> >>>>> wrote: >>>>> > What's wrong with the below query. >>>>> > >>>>> > >>>>> > SELECT buyer_id, item_id, created_time >>>>> > FROM ( >>>>> > SELECT buyer_id, item_id, rank(buyer_id) as rank, created_time >>>>> > FROM testingtable1 >>>>> > DISTRIBUTE BY buyer_id, item_id >>>>> > SORT BY buyer_id, item_id, created_time desc >>>>> > ) a >>>>> > WHERE rank < 10 >>>>> > ORDER BY buyer_id, rank >>>>> > >>>>> > >>>>> > I am always getting as - >>>>> > >>>>> > FAILED: Error in semantic analysis: line 3:30 Invalid Function rank >>>>> > >>>>> > >>>>> > I am using Hive 0.6 I guess. >>>>> > >>>>> > >>>>> > >>>>> > Raihan Jamal >>>>> > >>>>> >>>> >>>> >>> >>> >>> -- >>> Nitin Pawar >>> >>> >>