Modified Query that I wrote and its not working as expected output is.

*
*
*SELECT bid, pid, rank(bid), time, UNIX_TIMESTAMP(time)*
*FROM (*
*    SELECT bid, pid, time*
*    FROM table1*
* where to_date(from_unixtime(cast(UNIX_TIMESTAMP(time) as int))) =
'2012-07-09'*
*    DISTRIBUTE BY bid,pid,time*
*    SORT BY bid, time desc*
*) a*
*WHERE rank(bid) < 3;*





On Thu, Jul 19, 2012 at 1:41 PM, comptech geeky <comptechge...@gmail.com>wrote:

> I wrote this query after modifying it-
>
> *SELECT buyer_id, item_id, rank(buyer_id), created_time,
> UNIX_TIMESTAMP(created_time)*
> *FROM (*
> *    SELECT buyer_id, item_id, created_time*
> *    FROM testingtable1*
> * where to_date(from_unixtime(cast(UNIX_TIMESTAMP(created_time) as int)))
> = '2012-07-09'*
> *    DISTRIBUTE BY buyer_id,item_id*
> *    SORT BY buyer_id, created_time desc*
> *) a*
> *WHERE rank(buyer_id) < 3;*
>
> And the output I got is which is sligtly wrong as it is missing two rows-
>
> *1345653    330760137950       2012-07-09 21:43:29*
> *1345653    330760137950       2012-07-09 21:42:29*
> *1345653    330760137950       2012-07-09 21:41:29*
>
> These two rows are missing-
>
> 1345653    110909316904       2012-07-09 21:29:06
> 1345653    221065796761       2012-07-09 19:31:48
>
> So full output should be like this-
>
> *1345653    330760137950       2012-07-09 21:43:29*
> *1345653    330760137950       2012-07-09 21:42:29*
> *1345653    330760137950       2012-07-09 21:41:29*
> 1345653    110909316904       2012-07-09 21:29:06
> 1345653    221065796761       2012-07-09 19:31:48
>
>
> On Thu, Jul 19, 2012 at 1:29 PM, comptech geeky 
> <comptechge...@gmail.com>wrote:
>
>> Can you show me the exact query that I need to do for this particular
>> problem consideing my scenario? It will be of great help to me. As I am new
>> to HiveQL.
>>
>> I need TOP 3 for those if BID and PID gets matched but with different
>> timestamp.
>>
>>
>>
>> On Thu, Jul 19, 2012 at 1:15 PM, Philip Tromans <
>> philip.j.trom...@gmail.com> wrote:
>>
>>> Your rank() is being evaluated map side. Put your distribute by and sort
>>> by in an inner query, and then evaluate your rank() in an outer query.
>>>
>>> Phil.
>>> On Jul 19, 2012 9:00 PM, "comptech geeky" <comptechge...@gmail.com>
>>> wrote:
>>>
>>>> This is the below data in my Table1
>>>>
>>>>
>>>> BID       PID                       TIME
>>>>
>>>> --------------+-------------------------+--------------------------------
>>>> 1345653   330760137950       2012-07-09 21:42:29
>>>> 1345653   330760137950       2012-07-09 21:43:29
>>>> 1345653   330760137950       2012-07-09 21:40:29
>>>> 1345653   330760137950       2012-07-09 21:41:29
>>>> 1345653   110909316904       2012-07-09 21:29:06
>>>> 1345653   221065796761       2012-07-09 19:31:48
>>>>
>>>> So If I need to clarify the above scenario- I have data in above table
>>>> like this-
>>>> For USER *`1345653` *I have this PID `*330760137950` *four times but
>>>> with different timestamps in red color. So I need the output something like
>>>> this-
>>>>
>>>> Output that I need:-
>>>>
>>>> *1345653    330760137950       2012-07-09 21:43:29 *
>>>> *1345653    330760137950       2012-07-09 21:42:29 *
>>>> *1345653    330760137950       2012-07-09 21:41:29*
>>>> 1345653    110909316904       2012-07-09 21:29:06
>>>> 1345653    221065796761       2012-07-09 19:31:48
>>>>
>>>> So Basically If BID and PID are same but with different timestamps,
>>>> then I need TOP 3 sorted with TIME in descending order
>>>>
>>>> And for this I created rank UDF (User Defined Function). And I wrote
>>>> the below query but its not working for me. Can anyone help me on this?
>>>> *
>>>> *
>>>> *
>>>> *
>>>> *SELECT buyer_id, item_id, created_time*
>>>> *    FROM table1*
>>>> * where to_date(from_unixtime(cast(UNIX_TIMESTAMP(created_time) as
>>>> int))) = '2012-07-09'*
>>>> *    DISTRIBUTE BY buyer_id*
>>>> *    SORT BY buyer_id, created_time desc*
>>>>
>>>>
>>
>

Reply via email to