I am not sure about this but you defined a rank function and then aliased the column as rank itself. Can this cause the issue?
>From the error it looks like it could not figure out which column is rank in the where clause "WHERE rank > 1" from Edward's blog the correct query looks similar but his function is named as p_rank SELECT category,country,product,sales,rank FROM ( SELECT category,country,product,sales, p_rank(category, country) rank FROM ( SELECT category,country,product, sales FROM p_rank_demo DISTRIBUTE BY category,country SORT BY category,country,sales desc) t1) t2 WHERE rank <= 3 On Tue, Apr 2, 2013 at 10:45 PM, Keith Wiley <kwi...@keithwiley.com> wrote: > I did all that, I just didn't paste it into the email. That isn't the > problem. Sorry for the confusion. > > hive> add jar Rank.jar; > hive> create temporary function rank as 'com.example.hive.udf.Rank'; > > BTW, small typo, the condition at the end uses less-then, not > greater-then...obviously...but that isn't the problem either. > > On Apr 2, 2013, at 10:06 , Nitin Pawar wrote: > > > I dont see you added rank in the classpath and defined rank function > > > > can you try add jar and define the function and try again > > > > > > > > On Tue, Apr 2, 2013 at 10:33 PM, Keith Wiley <kwi...@keithwiley.com> > wrote: > > On Apr 1, 2013, at 16:12 , Alexander Pivovarov wrote: > > > > > > http://ragrawal.wordpress.com/2011/11/18/extract-top-n-records-in-each-group-in-hadoophive/ > > > > Is there any possibility there is a bug in Ritesh Agrawal's query > statement from that article? I created a test table with the exact column > names from the example in the article and used a minimally altered version > of the command (I removed the where clause to simplify things a bit) and > got an error which suggests there is something slightly wrong with the > command (or perhaps the table has to be configured a special way). Here's > what I get when I almost perfectly duplicate that example: > > > > hive> describe test; > > OK > > user string > > category string > > value int > > Time taken: 0.082 seconds > > ================================================== > > hive> select * from test; > > OK > > user1 cat1 1 > > user1 cat1 2 > > user1 cat1 3 > > user1 cat2 10 > > user1 cat2 20 > > user1 cat2 30 > > user2 cat1 11 > > user2 cat1 21 > > user2 cat1 31 > > user2 cat2 5 > > user2 cat2 6 > > user2 cat2 7 > > Time taken: 0.202 seconds > > ================================================== > > hive> SELECT user, category, value > > > FROM ( > > > SELECT user, category, rank(user) as rank, value > > > FROM test > > > DISTRIBUTE BY user > > > SORT BY user, value desc > > > ) a > > > WHERE rank > 1 > > > ORDER BY user, rank; > > FAILED: Error in semantic analysis: Line 9:15 Invalid table alias or > column reference 'rank': (possible column names are: _col0, _col1, _col2) > > hive> > > > > > ________________________________________________________________________________ > > Keith Wiley kwi...@keithwiley.com keithwiley.com > music.keithwiley.com > > > > "What I primarily learned in grad school is how much I *don't* know. > > Consequently, I left grad school with a higher ignorance to knowledge > ratio than > > when I entered." > > -- Keith Wiley > > > ________________________________________________________________________________ > > > > > > > > > > -- > > Nitin Pawar > > > > ________________________________________________________________________________ > Keith Wiley kwi...@keithwiley.com keithwiley.com > music.keithwiley.com > > "It's a fine line between meticulous and obsessive-compulsive and a > slippery > rope between obsessive-compulsive and debilitatingly slow." > -- Keith Wiley > > ________________________________________________________________________________ > > -- Nitin Pawar