Hello,

I'm trying to use the GenericUDFRank described in https://issues.apache.org/jira/browse/HIVE-2361. However, no matter which query I use, the result is not what I expect. Assume a Hive table of users with the columns: Country, City, userId.
I'm running the following query:

    ADD JAR Rank.jar;
    CREATE TEMPORARY FUNCTION rank AS 'com.nexr.platform.analysis.udf.GenericUDFRank';
    SELECT Country, City, rank(userId)
    FROM myTable
    DISTRIBUTE BY Country, City
    SORT BY Country, City, userId;

For the following table:

    US  NY  8
    US  NY  12
    US  NY  3
    US  NJ  10
    US  NJ  26

I'm expecting the following result:

    US  NY  1
    US  NY  2
    US  NY  3
    US  NJ  1
    US  NJ  2

But I get:

    US  NY  1
    US  NY  1
    US  NY  1
    US  NJ  1
    US  NJ  1

I also tried a different rank implementation (http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/doing_rank_with_hive), but the results were similar. I guess I'm using the UDF the wrong way, but I can't find the correct way.

Any help is appreciated.

Thanks