Hi all, Below are some of observations based on the on-going rank function discussion.
1. I executed below mentioned queries and only the query with "rank" (lowercase) executed successfully, rest were throwing exceptions "FAILED: SemanticException Failed to breakup Windowing invocations into Groups." - select cust_id, ord_dt, RANK() w from cust_ord window w as (partition by cust_id order by ord_dt); - select cust_id, ord_dt, Rank() w from cust_ord window w as (partition by cust_id order by ord_dt); - select cust_id, ord_dt, rank() w from cust_ord window w as (partition by cust_id order by ord_dt); It seems "rank" keyword is case-sensitive. Attached is the screenshot for reference. 2. I created a dummy table with the data provided in the below mail trail and achieved the expected output, using the below mentioned query. *select cust_id, ord_dt, rank() over (partition by cust_id order by ord_dt) from cust_ord;* Request all to kindly review these details and suggest if it was of any help! Thanks. On Sat, Jul 27, 2013 at 12:07 AM, j.barrett Strausser < j.barrett.straus...@gmail.com> wrote: > Any further help on this, otherwise I'll file a jira. > > > On Wed, Jul 24, 2013 at 11:32 PM, j.barrett Strausser < > j.barrett.straus...@gmail.com> wrote: > >> As an example : If I run my query above removing the arg the following is >> thrown. >> >> FAILED: SemanticException Failed to breakup Windowing invocations into >> Groups. At least 1 group must only depend on input columns. Also check for >> circular dependencies. >> Underlying error: >> org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: One or more >> arguments are expected. >> >> >> Similar issue and fix here: >> >> http://www.marshut.com/rqvpz/use-rank-over-partition-function-in-hive-11.html >> >> Even if it didn't require an arg it still doesn't explain my anomalous >> output. >> >> >> >> On Wed, Jul 24, 2013 at 11:28 PM, j.barrett Strausser < >> j.barrett.straus...@gmail.com> wrote: >> >>> That isn't true. If you try to run the above HIVE without an argument, >>> it will throw an exception. I have seen other users replicate this problem >>> as well. >>> >>> I can file a JIRA if someone can confirm that my query should work. >>> >>> >>> On Wed, Jul 24, 2013 at 11:02 PM, manishbh...@rocketmail.com < >>> manishbh...@rocketmail.com> wrote: >>> >>>> Analytical function doesn't expect any argument. Rank() itself enough >>>> to sequence based on the window you have defined in partition by. So >>>> >>>> Rank() over (partition by cmscustid order by orderdate) >>>> >>>> Should work as long as I have wrote right syntax for hive. >>>> >>>> Sent via Rocket from my HTC >>>> >>>> ----- Reply message ----- >>>> From: "j.barrett Strausser" <j.barrett.straus...@gmail.com> >>>> To: <user@hive.apache.org> >>>> Subject: Semantics of Rank. >>>> Date: Thu, Jul 25, 2013 1:08 AM >>>> >>>> >>>> Thanks for the reply. Perhaps my misunderstanding of the relation >>>> between >>>> rank and the windowing function is wrong. >>>> >>>> What I want to achieve for the following is : For a given customer id, >>>> sort his orders. I thought the below would work. >>>> >>>> SELECT eh.cmsorderid, eh.orderdate, RANK(orderdate) w FROM order_data eh >>>> window w as (partition by cmscustid order by orderdate); >>>> >>>> The rank function instead returns the rank of the order date over all >>>> all >>>> order dates. >>>> >>>> Example snippet from above >>>> >>>> Actual : >>>> >>>> 6758783 27APR2012 94 >>>> 6758783 23JUN2012 95 >>>> 6758785 14DEC2012 96 >>>> 6758795 18DEC2011 97 >>>> 6758796 06MAY2012 98 >>>> 6758798 24MAR2013 99 >>>> 6758799 23NOV2012 100 >>>> >>>> >>>> Expected : >>>> >>>> 6758783 27APR2012 1 >>>> 6758783 23JUN2012 2 >>>> 6758785 14DEC2012 1 >>>> 6758795 18DEC2011 1 >>>> 6758796 06MAY2012 1 >>>> 6758798 24MAR2013 1 >>>> 6758799 23NOV2012 1 >>>> >>>> >>>> -b >>>> >>>> >>>> >>>> >>>> On Wed, Jul 24, 2013 at 3:17 PM, Shahar Glixman <sglix...@outbrain.com >>>> >wrote: >>>> >>>> > the argument to rank is simply some value, whereas the rank function >>>> > compare this value >>>> > to the previous value received, if value is same, rank returns >>>> ++index, >>>> > otherwise, rank return 1. >>>> > pseudo code: >>>> > >>>> > class Rank { >>>> > int index; >>>> > Object previousValue = null; >>>> > int evaluate(Object value) { >>>> > if (value == previousValue) { >>>> > return ++index; >>>> > } >>>> > previousValue = value; >>>> > index = 1; >>>> > return 1; >>>> > } >>>> > >>>> > >>>> > On Wed, Jul 24, 2013 at 9:59 PM, j.barrett Strausser < >>>> > j.barrett.straus...@gmail.com> wrote: >>>> > >>>> >> It seems as though I am required to pass in an argument to RANK(). >>>> >> >>>> >> What is the effect of passing this argument in ? >>>> >> >>>> >> In a RANK function the output for RANK should be the number of rows >>>> >> preceding >>>> >> a row in a given window. As windows are specified by the partition >>>> and >>>> >> order by fields I don't understand the effect of passing in an arg >>>> to RANK >>>> >> >>>> >> Are the other non-aggregate function similar? Meaning : Cume_dist, >>>> Ntile >>>> >> -- >>>> >> >>>> >> >>>> >> https://github.com/bearrito >>>> >> @deepbearrito >>>> >> >>>> > >>>> > >>>> > The above terms reflect a potential business arrangement, are >>>> provided solely >>>> > as a basis for further discussion, and are not intended to be and do >>>> not >>>> > constitute a legally binding obligation. No legally binding >>>> obligations will >>>> > be created, implied, or inferred until an agreement in final form is >>>> executed >>>> > in writing by all parties involved. >>>> > >>>> > This email and any attachments hereto may be confidential or >>>> privileged. >>>> > If you received this communication by mistake, please don't forward >>>> it >>>> > to anyone else, please erase all copies and attachments, and please >>>> let >>>> > me know that it has gone to the wrong person. Thanks. >>>> > >>>> >>>> >>>> >>>> -- >>>> >>>> >>>> https://github.com/bearrito >>>> @deepbearrito >>>> >>> >>> >>> >>> -- >>> >>> >>> https://github.com/bearrito >>> @deepbearrito >>> >> >> >> >> -- >> >> >> https://github.com/bearrito >> @deepbearrito >> > > > > -- > > > https://github.com/bearrito > @deepbearrito >
<<attachment: rank-fun.png>>