I filed HIVE-5038. My issue was that it seemed incorrect that rank took an arg. I haven't verified that the issue is fixed in trunk as claimed in the JIRA.However, between the reponse in JIRA and in this thread it seems resolved.
On Tue, Sep 3, 2013 at 4:23 AM, Lefty Leverenz <leftylever...@gmail.com>wrote: > Another email thread led me to > HIVE-5038<https://issues.apache.org/jira/browse/HIVE-5038>("rank operator is > case-sensitive and has odd semantics") -- it's resolved > as invalid, but is that only for the odd semantics? > > Perhaps this issue is clarified in more recent emails. I'm catching up on > a huge backlog. > > -- Lefty > > > On Tue, Sep 3, 2013 at 4:03 AM, Lefty Leverenz <leftylever...@gmail.com>wrote: > >> What's the answer -- does the "rank" keyword have to be lowercase? >> >> If lowercase is obligatory we need to revise the wiki, which shows all >> uppercase ( >> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics >> ). >> >> In the test files it's lowercase (windowing_rank.q, >> ptf_negative_WhereWithRankCond.q). >> The patch for HIVE-896 shows a lowercase name in GenericUDAFRank.java but >> I don't know if that means lowercase is required: >> >> @WindowFunctionDescription >>> >>> ( >>> >>> description = @Description( >>> >>> name = "rank", >>> >>> value = "_FUNC_(x)" >>> >>> ), >>> >>> supportsWindow = false, >>> >>> pivotResult = true >>> >>> ) >>> >> >> >> And what about the other keywords in the wikidoc? Same lowercase >> requirement? >> >> -- Lefty >> >> >> On Fri, Jul 26, 2013 at 5:30 PM, saurabh <mpp.databa...@gmail.com> wrote: >> >>> Hi all, >>> >>> Below are some of observations based on the on-going rank function >>> discussion. >>> >>> 1. I executed below mentioned queries and only the query with "rank" >>> (lowercase) executed successfully, rest were throwing exceptions "FAILED: >>> SemanticException Failed to breakup Windowing invocations into Groups." >>> >>> - select cust_id, ord_dt, RANK() w from cust_ord window w as (partition >>> by cust_id order by ord_dt); >>> >>> - select cust_id, ord_dt, Rank() w from cust_ord window w as (partition >>> by cust_id order by ord_dt); >>> >>> - select cust_id, ord_dt, rank() w from cust_ord window w as >>> (partition by cust_id order by ord_dt); >>> >>> It seems "rank" keyword is case-sensitive. Attached is the screenshot >>> for reference. >>> >>> 2. I created a dummy table with the data provided in the below mail >>> trail and achieved the expected output, using the below mentioned query. >>> >>> *select cust_id, ord_dt, rank() over (partition by cust_id order by >>> ord_dt) from cust_ord;* >>> >>> Request all to kindly review these details and suggest if it was of any >>> help! >>> >>> Thanks. >>> >>> >>> On Sat, Jul 27, 2013 at 12:07 AM, j.barrett Strausser < >>> j.barrett.straus...@gmail.com> wrote: >>> >>>> Any further help on this, otherwise I'll file a jira. >>>> >>>> >>>> On Wed, Jul 24, 2013 at 11:32 PM, j.barrett Strausser < >>>> j.barrett.straus...@gmail.com> wrote: >>>> >>>>> As an example : If I run my query above removing the arg the following >>>>> is thrown. >>>>> >>>>> FAILED: SemanticException Failed to breakup Windowing invocations into >>>>> Groups. At least 1 group must only depend on input columns. Also check for >>>>> circular dependencies. >>>>> Underlying error: >>>>> org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: One or more >>>>> arguments are expected. >>>>> >>>>> >>>>> Similar issue and fix here: >>>>> >>>>> http://www.marshut.com/rqvpz/use-rank-over-partition-function-in-hive-11.html >>>>> >>>>> Even if it didn't require an arg it still doesn't explain my anomalous >>>>> output. >>>>> >>>>> >>>>> >>>>> On Wed, Jul 24, 2013 at 11:28 PM, j.barrett Strausser < >>>>> j.barrett.straus...@gmail.com> wrote: >>>>> >>>>>> That isn't true. If you try to run the above HIVE without an >>>>>> argument, it will throw an exception. I have seen other users replicate >>>>>> this problem as well. >>>>>> >>>>>> I can file a JIRA if someone can confirm that my query should work. >>>>>> >>>>>> >>>>>> On Wed, Jul 24, 2013 at 11:02 PM, manishbh...@rocketmail.com < >>>>>> manishbh...@rocketmail.com> wrote: >>>>>> >>>>>>> Analytical function doesn't expect any argument. Rank() itself >>>>>>> enough to sequence based on the window you have defined in partition >>>>>>> by. So >>>>>>> >>>>>>> Rank() over (partition by cmscustid order by orderdate) >>>>>>> >>>>>>> Should work as long as I have wrote right syntax for hive. >>>>>>> >>>>>>> Sent via Rocket from my HTC >>>>>>> >>>>>>> ----- Reply message ----- >>>>>>> From: "j.barrett Strausser" <j.barrett.straus...@gmail.com> >>>>>>> To: <user@hive.apache.org> >>>>>>> Subject: Semantics of Rank. >>>>>>> Date: Thu, Jul 25, 2013 1:08 AM >>>>>>> >>>>>>> >>>>>>> Thanks for the reply. Perhaps my misunderstanding of the relation >>>>>>> between >>>>>>> rank and the windowing function is wrong. >>>>>>> >>>>>>> What I want to achieve for the following is : For a given customer >>>>>>> id, >>>>>>> sort his orders. I thought the below would work. >>>>>>> >>>>>>> SELECT eh.cmsorderid, eh.orderdate, RANK(orderdate) w FROM >>>>>>> order_data eh >>>>>>> window w as (partition by cmscustid order by orderdate); >>>>>>> >>>>>>> The rank function instead returns the rank of the order date over >>>>>>> all all >>>>>>> order dates. >>>>>>> >>>>>>> Example snippet from above >>>>>>> >>>>>>> Actual : >>>>>>> >>>>>>> 6758783 27APR2012 94 >>>>>>> 6758783 23JUN2012 95 >>>>>>> 6758785 14DEC2012 96 >>>>>>> 6758795 18DEC2011 97 >>>>>>> 6758796 06MAY2012 98 >>>>>>> 6758798 24MAR2013 99 >>>>>>> 6758799 23NOV2012 100 >>>>>>> >>>>>>> >>>>>>> Expected : >>>>>>> >>>>>>> 6758783 27APR2012 1 >>>>>>> 6758783 23JUN2012 2 >>>>>>> 6758785 14DEC2012 1 >>>>>>> 6758795 18DEC2011 1 >>>>>>> 6758796 06MAY2012 1 >>>>>>> 6758798 24MAR2013 1 >>>>>>> 6758799 23NOV2012 1 >>>>>>> >>>>>>> >>>>>>> -b >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Wed, Jul 24, 2013 at 3:17 PM, Shahar Glixman < >>>>>>> sglix...@outbrain.com>wrote: >>>>>>> >>>>>>> > the argument to rank is simply some value, whereas the rank >>>>>>> function >>>>>>> > compare this value >>>>>>> > to the previous value received, if value is same, rank returns >>>>>>> ++index, >>>>>>> > otherwise, rank return 1. >>>>>>> > pseudo code: >>>>>>> > >>>>>>> > class Rank { >>>>>>> > int index; >>>>>>> > Object previousValue = null; >>>>>>> > int evaluate(Object value) { >>>>>>> > if (value == previousValue) { >>>>>>> > return ++index; >>>>>>> > } >>>>>>> > previousValue = value; >>>>>>> > index = 1; >>>>>>> > return 1; >>>>>>> > } >>>>>>> > >>>>>>> > >>>>>>> > On Wed, Jul 24, 2013 at 9:59 PM, j.barrett Strausser < >>>>>>> > j.barrett.straus...@gmail.com> wrote: >>>>>>> > >>>>>>> >> It seems as though I am required to pass in an argument to RANK(). >>>>>>> >> >>>>>>> >> What is the effect of passing this argument in ? >>>>>>> >> >>>>>>> >> In a RANK function the output for RANK should be the number of >>>>>>> rows >>>>>>> >> preceding >>>>>>> >> a row in a given window. As windows are specified by the >>>>>>> partition and >>>>>>> >> order by fields I don't understand the effect of passing in an >>>>>>> arg to RANK >>>>>>> >> >>>>>>> >> Are the other non-aggregate function similar? Meaning : >>>>>>> Cume_dist, Ntile >>>>>>> >> -- >>>>>>> >> >>>>>>> >> >>>>>>> >> https://github.com/bearrito >>>>>>> >> @deepbearrito >>>>>>> >> >>>>>>> > >>>>>>> > >>>>>>> > The above terms reflect a potential business arrangement, are >>>>>>> provided solely >>>>>>> > as a basis for further discussion, and are not intended to be and >>>>>>> do not >>>>>>> > constitute a legally binding obligation. No legally binding >>>>>>> obligations will >>>>>>> > be created, implied, or inferred until an agreement in final form >>>>>>> is executed >>>>>>> > in writing by all parties involved. >>>>>>> > >>>>>>> > This email and any attachments hereto may be confidential or >>>>>>> privileged. >>>>>>> > If you received this communication by mistake, please don't >>>>>>> forward it >>>>>>> > to anyone else, please erase all copies and attachments, and >>>>>>> please let >>>>>>> > me know that it has gone to the wrong person. Thanks. >>>>>>> > >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> >>>>>>> >>>>>>> https://github.com/bearrito >>>>>>> @deepbearrito >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> >>>>>> >>>>>> https://github.com/bearrito >>>>>> @deepbearrito >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> >>>>> >>>>> https://github.com/bearrito >>>>> @deepbearrito >>>>> >>>> >>>> >>>> >>>> -- >>>> >>>> >>>> https://github.com/bearrito >>>> @deepbearrito >>>> >>> >>> >> > -- https://github.com/bearrito @deepbearrito