like YYYY-MM-DD.
>> For example, for "2-oct-2013" it will be 2013-10-02.
>>
>> Best Regards,
>> Nishant Kelkar
>>
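As a rough illustration of the format advice above, the following HiveQL sketch converts a string such as "2-oct-2013" into a sortable year-month-day string. The table name `my_table` and the column name `raw_date` are hypothetical, and the pattern `'d-MMM-yyyy'` is an assumption about how the source data is laid out. Whether lowercase month names parse may depend on the Hive version, so this is worth checking on a few rows first.

```sql
-- Sketch only: my_table and raw_date are hypothetical names.
-- unix_timestamp(string, pattern) parses the text into epoch seconds;
-- from_unixtime(seconds, pattern) renders it back as a sortable string.
SELECT
  cno,
  sqno,
  from_unixtime(unix_timestamp(raw_date, 'd-MMM-yyyy'), 'yyyy-MM-dd') AS iso_date
FROM my_table;
```

Once the dates are stored in this form, plain string ordering matches chronological ordering.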
>> On Wed, Sep 10, 2014 at 11:48 AM, Raj Hadoop wrote:
>>
>> The
>>
>> SORT_ARRAY(COLLECT_SET(date))[0] AS latest_date
>>
>> is returning the lowest date. I need the largest date.
> On Wed, 9/10/14, Raj Hadoop wrote:
>
> Subject: Re: Remove duplicate records in Hive
> To: user@hive.apache.org
> Date: Wednesday, September 10, 2014, 2:41 PM
>
> Thanks. I will try it.
The
SORT_ARRAY(COLLECT_SET(date))[0] AS latest_date
is returning the lowest date. I need the largest date.
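
One possible way to get the largest date instead: SORT_ARRAY sorts in ascending order, so index 0 is the smallest element. A minimal sketch, assuming the dates are stored as strings that sort chronologically (for example year-month-day), is to use MAX directly. The table name `my_table` is a hypothetical stand-in for the real table, and the backticks around `date` are there because it is a reserved word in some Hive versions.

```sql
-- Sketch only: my_table is a hypothetical table name.
-- MAX on a sortable date string returns the most recent date per group,
-- which also collapses duplicate (cno, sqno) rows into one.
SELECT
  cno,
  sqno,
  MAX(`date`) AS latest_date
FROM my_table
GROUP BY cno, sqno;
```
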
On Wed, 9/10/14, Raj Hadoop wrote:
Subject: Re: Remove duplicate records in Hive
To: user@hive.apache.org
Date: Wednesday, September 10
Thanks. I will try it.
On Wed, 9/10/14, Nishant Kelkar wrote:
Subject: Re: Remove duplicate records in Hive
To: user@hive.apache.org, hadoop...@yahoo.com
Date: Wednesday, September 10, 2014, 1:59 PM
Hi Raj,
You can do something along these lines:
SELECT cno, sqno, SORT_ARRAY(COLLECT_SET(date))[0] AS latest_date FROM
table GROUP BY cno, sqno;
However, you have to make sure your date format is such that sorting it
gives you the most recent date. The best way to do that is to have it in
format like YYYY-MM-DD. For example, for "2-oct-2013" it will be 2013-10-02.
Whoops, thought this was someone in my office, so obviously you can’t come see
me :)
--
Kevin Weiler
IT
IMC Financial Markets | 233 S. Wacker Drive, Suite 4300 | Chicago, IL 60606 |
http://imc-chicago.com/
Phone: +1 312-204-7439 | Fax: +1 312-244-3301 | E-Mail:
kevin.wei...@imc-chicago.com
If you can just query the table for your results, you can do a SELECT DISTINCT
instead of just a SELECT. If you give me a bit more information about where the
duplicate data is coming from, I can provide a bit more detail. You can come
see me on the end of desk.
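
A minimal sketch of the SELECT DISTINCT suggestion, assuming the duplicates are rows that match on every selected column. The column names are taken from the query elsewhere in this thread, and `my_table` is a hypothetical table name.

```sql
-- Sketch only: my_table is a hypothetical table name.
-- DISTINCT keeps one copy of each unique (cno, sqno, date) combination.
SELECT DISTINCT
  cno,
  sqno,
  `date`
FROM my_table;
```

This only removes rows that are identical across all listed columns. Rows that share cno and sqno but differ in date would still appear separately.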
--
Kevin Weiler
IT